
Data analysis and visualization

Visualization helps us understand the underlying form of the data and the relationship between the features and the response. To explore the relationship between the advertising data features and the response, we are going to use scatterplots.

To make different types of visualizations of your data, you can use Matplotlib (https://matplotlib.org/), a Python 2D plotting library. To get Matplotlib, follow the installation instructions at https://matplotlib.org/users/installing.html.

Let's import the visualization library Matplotlib:

import matplotlib.pyplot as plt

# The next line makes plots render inline, directly in the notebook,
# instead of popping up in a separate window
%matplotlib inline
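
The `%matplotlib inline` line is IPython magic syntax, so it only works inside a notebook or IPython session. As a minimal sketch of one alternative for a plain Python script, the following selects the non-interactive `Agg` backend and writes the figure to a file. The file name `scatter_example.png` and the three placeholder points are assumptions made for illustration only.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: renders to files, opens no window
import matplotlib.pyplot as plt

# Hypothetical output location, chosen only for this sketch
out_path = os.path.join(tempfile.gettempdir(), "scatter_example.png")

fig, ax = plt.subplots()
ax.scatter([1, 2, 3], [2, 4, 6])  # placeholder points, not the advertising data
fig.savefig(out_path)             # write the figure to disk instead of showing it
plt.close(fig)                    # release the figure's memory
```

In an interactive desktop session you would more likely skip the backend selection and call `plt.show()` instead of saving.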

Now, let's use a scatterplot to visualize the relationship between the advertising data features and response variable:

# One row of three plots sharing the y-axis (sales); the figure size is set once here
fig, axs = plt.subplots(1, 3, sharey=True, figsize=(16, 8))

# Adding the scatterplots to the grid
advertising_data.plot(kind='scatter', x='TV', y='sales', ax=axs[0])
advertising_data.plot(kind='scatter', x='radio', y='sales', ax=axs[1])
advertising_data.plot(kind='scatter', x='newspaper', y='sales', ax=axs[2])

Output:

Figure 1: Scatter plot for understanding the relationship between the advertising data features and the response variable
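
A scatterplot gives a visual impression; a correlation coefficient is one rough way to put a number on it. The sketch below is an illustration, not part of the original analysis. Because the real dataset is not reproduced here, it builds a synthetic stand-in called `demo_data` that only borrows the column names; its values and the coefficients used to generate `sales` are arbitrary assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for advertising_data (NOT the real dataset):
# only the column names match; all values are randomly generated.
rng = np.random.default_rng(0)
n = 200
demo_data = pd.DataFrame({
    "TV": rng.uniform(0, 300, n),
    "radio": rng.uniform(0, 50, n),
    "newspaper": rng.uniform(0, 100, n),
})
# Arbitrary assumption: sales depends on TV and radio plus noise, not on newspaper
demo_data["sales"] = (
    5 + 0.05 * demo_data["TV"] + 0.1 * demo_data["radio"] + rng.normal(0, 1.5, n)
)

# Pearson correlation of each feature column with the response
features = ["TV", "radio", "newspaper"]
correlations = demo_data[features].corrwith(demo_data["sales"])
print(correlations.sort_values(ascending=False))
```

With the real data loaded, the same `corrwith` call can be applied to `advertising_data` directly. Keep in mind that correlation only captures linear association and says nothing about causation.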

Next, we want to see how the ads help increase sales, so we need to ask a few questions. Worthwhile questions include: What is the relationship between the ads and sales? Which kind of ad contributes most to sales? What is the approximate effect of each type of ad on sales? We will try to answer such questions using a simple linear model.
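
To make the idea of a simple linear model concrete, here is a minimal sketch that fits a straight line, sales ≈ intercept + slope × spend, by least squares using NumPy's `polyfit`. The helper name `fit_line` is hypothetical, the toy numbers are made up so that they lie exactly on a known line, and this is not necessarily the fitting method used later in the text.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y ≈ intercept + slope * x (hypothetical helper)."""
    # np.polyfit returns coefficients from the highest degree down: [slope, intercept]
    slope, intercept = np.polyfit(x, y, deg=1)
    return intercept, slope

# Toy values (not the advertising data), constructed to lie on y = 2 + 0.5 * x
spend = np.array([0.0, 10.0, 20.0, 30.0])
sales = 2.0 + 0.5 * spend

intercept, slope = fit_line(spend, sales)
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

With the real data, one could call, for example, `fit_line(advertising_data['TV'], advertising_data['sales'])` once per feature. The slope then estimates the change in sales associated with one extra unit of spend on that channel when the channel is considered on its own.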
