  • Applied Supervised Learning with R
  • Karthik Ramasubramanian Jojo Moolayil
  • 2021-06-11 13:22:30

Defining the Problem Statement

Recall the bank marketing data we explored in Chapter 1, R for Advanced Analytics: the dataset captures the telemarketing campaigns conducted by a bank to attract customers.

A large multinational bank is designing a marketing campaign to achieve its growth target by enticing customers for bank deposits. The campaign has been ineffective in luring customers, and the marketing team wants to understand how the campaign can be improved to achieve the growth targets.

We can reframe the problem from the business stakeholders' perspective and try to see what kind of solution would best fit here.

Problem-Designing Artifacts

Just as there are several frameworks, templates, and artifacts for software engineering and other industrial projects, data science and business analytics projects can also be represented effectively using industry-standard artifacts. Some popular choices come from consulting giants such as McKinsey and BCG, and from decision sciences giants such as Mu Sigma. We will use a popular framework based on the Minto Pyramid Principle called Situation-Complication-Question (SCQ) Analysis.

Let's try defining the problem statement in the following construct:

  • Situation: Define the current situation. We can simplify this by answering the question: what happened?

    A large multinational bank is designing a marketing campaign to achieve its growth target by enticing customers for bank deposits. The campaign has been ineffective in luring customers, and the marketing team wants to understand how the campaign can be improved to achieve the growth targets.

The previous section framed a hypothetical business problem for the banking data use case. Although the real situation might differ, the use case we are trying to solve is a valid one. Representing the problem statement in the format shown above gives us a clear area to focus on. This completes the first step in the life cycle of a typical data science use case. The second step is data gathering, which we explored in the previous chapter. We will use the same dataset, provided by the UCI Machine Learning Repository at https://archive.ics.uci.edu/ml/datasets/Bank%20Marketing.

Note

[Moro et al., 2014] S. Moro, P. Cortez, and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014.

This brings us to the final step: exploratory data analysis (EDA). In this use case, we want to understand the factors that lead to the campaign's poor performance. Before we delve into the actual exercise, let's take a moment to understand the concept of EDA more intuitively.
