
There's more...

We have seen a few ways to explore data, both statistically and visually. There are quite a few libraries in Python that you can use to visualize your data. One of the most widely used of these is ggplot. Before we look at a few commands, let's learn how ggplot works.

There are seven layers of grammatical elements in ggplot, of which the first three are mandatory:

  • Data
  • Aesthetics
  • Geometrics
  • Facets
  • Statistics
  • Coordinates
  • Theme

You will often start by providing a dataset to ggplot(). Then, you provide an aesthetic mapping with the aes() function to map the variables to the x and y axes. With aes(), you can also set the color, size, shape, and position of the plotted elements. You then add the type of geometric shape you want with functions such as geom_point() or geom_histogram(). You can also add various options, such as plotting statistical summaries, faceting, visual themes, and coordinate systems.

The following code is an extension to what we have used already in this chapter, so we will directly delve into the ggplot code here:

f = pd.melt(housepricesdata, id_vars=['SalePrice'], value_vars=numerical_features[0:9])
ggplot(f, aes('value', 'SalePrice')) + geom_point(color='orange') + facet_wrap('variable', scales='free')

The preceding code generates the following chart:

Similarly, in order to view the density plot for the numerical variables, we can execute the following code:

f_1 = pd.melt(housepricesdata, value_vars=numerical_features[0:9])
ggplot(f_1, aes('value')) + geom_density(color="red") + facet_wrap('variable',scales='free')

The plot shows a univariate density plot for each of our numerical variables. The geom_density() function computes and draws a kernel density estimate, which is a smoothed version of the histogram.
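To make the idea of a kernel density estimate concrete, here is a minimal hand-rolled sketch that uses only the standard library. It assumes a Gaussian kernel and a bandwidth fixed by hand; geom_density() chooses its own bandwidth by default, so this is not a reproduction of what the plotting library does internally. The sample values are hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point x."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, centred on that observation.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical sample values, purely illustrative.
values = [1.0, 1.2, 1.9, 2.1, 2.3, 4.8, 5.0]
density = gaussian_kde(values, bandwidth=0.5)

# Evaluate on a grid from -2 to 8 and approximate the area under the curve.
step = 0.01
grid = [-2.0 + i * step for i in range(1001)]
area = sum(density(x) for x in grid) * step
```

Each observation contributes a small bell curve, and the estimate is their average, so the area under the curve stays close to 1. A larger bandwidth gives a smoother curve, while a smaller one follows the individual observations more closely.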
