
There's more...

We have seen a few ways to explore data, both statistically and visually. There are quite a few libraries in Python that you can use to visualize your data. One of the most widely used of these is ggplot. Before we look at a few commands, let's learn how ggplot works.

ggplot has seven layers of grammatical elements, of which the first three are mandatory:

  • Data
  • Aesthetics
  • Geometries
  • Facets
  • Statistics
  • Coordinates
  • Theme

You will often start by providing a dataset to ggplot(). Then, you provide an aesthetic mapping with the aes() function to map the variables to the x and y axes. With aes(), you can also set the color, size, shape, and position of the plotted elements. You then add the type of geometric shape you want with functions such as geom_point() or geom_histogram(). You can also add further options, such as statistical summaries, faceting, visual themes, and coordinate systems.
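To make the layering idea concrete, here is a toy model of how a plot specification can be built up with the + operator. It is only a conceptual sketch: ToyPlot and Layer are hypothetical names invented for this illustration, they draw nothing, and they are not how ggplot is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Layer:
    """One grammatical element, such as a geometry or a facet specification."""
    kind: str
    options: dict = field(default_factory=dict)


@dataclass(frozen=True)
class ToyPlot:
    """Holds the data and aesthetic mapping, plus any layers added so far."""
    data: list
    aes: dict
    layers: tuple = ()

    def __add__(self, layer):
        # Adding a layer returns a new plot; the original is left untouched.
        return ToyPlot(self.data, self.aes, self.layers + (layer,))

    def describe(self):
        # Summarize the specification in the order it was built.
        parts = [f"aes({self.aes['x']}, {self.aes['y']})"]
        parts += [layer.kind for layer in self.layers]
        return " + ".join(parts)


# Made-up rows standing in for a dataset.
rows = [{"area": 50, "price": 100}, {"area": 80, "price": 150}]
plot = ToyPlot(rows, {"x": "area", "y": "price"})
plot = plot + Layer("geom_point", {"color": "orange"}) + Layer("facet_wrap", {"by": "variable"})
print(plot.describe())
```

The point of the sketch is the order of construction: data and aesthetics come first, a geometry is added next, and optional elements such as facets are appended afterward.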

The following code builds on what we have already used in this chapter, so we will go straight to the ggplot code:

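# Reshape to long format: one row per (SalePrice, feature name, feature value)
f = pd.melt(housepricesdata, id_vars=['SalePrice'], value_vars=numerical_features[0:9])
# Scatter each feature against SalePrice, one panel per feature with its own axis scales
ggplot(f, aes('value', 'SalePrice')) + geom_point(color='orange') + facet_wrap('variable', scales='free')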

The preceding code generates the following chart:
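The pd.melt() call is what makes the faceting possible, because facet_wrap('variable') needs a single column that names the feature each row belongs to. The following minimal sketch shows the reshaping on a tiny DataFrame. The values are made up, and the column names LotArea and YearBuilt are assumed here purely for illustration.

```python
import pandas as pd

# Made-up values; only the column layout mirrors the house-price data.
toy = pd.DataFrame({
    "SalePrice": [200000, 150000],
    "LotArea": [8450, 9600],
    "YearBuilt": [2003, 1976],
})

# Keep SalePrice as an identifier and stack the other columns into
# a 'variable' column (the feature name) and a 'value' column (its value).
long_form = pd.melt(toy, id_vars=["SalePrice"], value_vars=["LotArea", "YearBuilt"])
print(long_form)
```

Each original row now appears once per melted feature, so two rows and two features give four rows in the long format.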

Similarly, to view density plots for the numerical variables, we can execute the following code:

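# Reshape to long format without an identifier column: one row per (feature name, feature value)
f_1 = pd.melt(housepricesdata, value_vars=numerical_features[0:9])
# Draw one density curve per feature, each panel with its own axis scales
ggplot(f_1, aes('value')) + geom_density(color="red") + facet_wrap('variable', scales='free')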

The plot shows a univariate density plot for each of our numerical variables. geom_density() computes and draws a kernel density estimate, which is a smoothed version of the histogram:
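To see what a kernel density estimate involves, here is a minimal sketch of a Gaussian kernel density estimator in plain Python. It illustrates the general technique only: the function name gaussian_kde, the sample values, and the hand-picked bandwidth are assumptions for this example, and geom_density() may use a different kernel or bandwidth rule.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x from the samples."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, centred on that observation.
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


samples = [1.0, 1.2, 1.9, 2.1, 2.3, 4.0]  # made-up values
density = gaussian_kde(samples, bandwidth=0.5)

# Riemann sum over a wide grid: the area under a density should be close to 1.
step = 0.01
grid = [-5 + i * step for i in range(1501)]  # covers -5 to 10
area = sum(density(x) for x in grid) * step
print(round(area, 3))
```

A smaller bandwidth makes the curve follow individual observations more closely, while a larger one smooths them together, which is the same trade-off as choosing the bin width of a histogram.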
