
Activity: Creating a Histogram and Explaining its Features

Scenario

Histograms are useful when you want to find the peak and spread of a distribution. For example, a company might want to see what the age distribution of its clients looks like. A two-dimensional distribution can show relationships between variables; for example, one can create a scatter plot of the incomes and ages of credit card holders.

Aim

To create and analyze histograms for the given dataset.

Prerequisites

You should be able to use ggplot2 to create a histogram.


The template code is an empty script in which the libraries are already loaded. You will write your code there.

Steps for Completion

  1. Use the template code and load the required datasets.
  2. Create the histogram for two cities.
  3. Analyze and compare the two histograms to determine how they differ.

Outcome

Two histograms should be created and compared. The complete code is as follows:

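library(ggplot2)  # already loaded if you start from the template code
# Load the hourly temperature data; each city is a separate column
df_t <- read.csv("data/historical-hourly-weather-data/temperature.csv")
ggplot(df_t, aes(x = Vancouver)) + geom_histogram()  # histogram for Vancouver
ggplot(df_t, aes(x = Miami)) + geom_histogram()      # histogram for Miami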

Refer to the complete code at https://goo.gl/tu7t4y.

Take a look at the following output histogram:

From the preceding plot, we can determine the following information:

  • Vancouver's temperatures peak at around 280 (the values appear to be in kelvin); this is the most frequent value, not the maximum.
  • They range between 260 and 300.
  • It's a right-skewed distribution.

Take a look at the following output histogram:

From the preceding plot, we can determine the following information:

  • Miami's temperatures peak at around 300; this is the most frequent value, not the maximum.
  • They range between 280 and 308.
  • It's a left-skewed distribution.
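
The features read off the two plots (range, peak, and skew direction) can also be checked numerically. The following is a minimal sketch rather than part of the book's solution: the helper name summarise_temps and the bin width of 2 are assumptions made for illustration, and the skewness is the simple moment-based estimate.

```r
# Summarise one city's temperature column: range, histogram peak, skew direction.
# `summarise_temps` and binwidth = 2 are illustrative choices, not from the book.
summarise_temps <- function(x, binwidth = 2) {
  x <- x[!is.na(x)]  # the hourly data may contain missing readings
  # Bin edges aligned to multiples of binwidth and covering the whole range
  breaks <- seq(floor(min(x) / binwidth) * binwidth,
                ceiling(max(x) / binwidth) * binwidth + binwidth,
                by = binwidth)
  h <- hist(x, breaks = breaks, plot = FALSE)
  m <- mean(x)
  list(
    min = min(x),
    max = max(x),
    peak = h$mids[which.max(h$counts)],                 # centre of the tallest bin
    skewness = mean((x - m)^3) / mean((x - m)^2)^1.5    # > 0 right-skewed, < 0 left-skewed
  )
}

# Usage, assuming df_t was loaded as in the code above:
# summarise_temps(df_t$Vancouver)
# summarise_temps(df_t$Miami)
```

A positive skewness value indicates a longer tail toward high temperatures (right-skewed), and a negative value indicates a longer tail toward low temperatures (left-skewed). The reported peak depends on the chosen bin width.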

Differences

  1. Miami's temperature distribution is left-skewed, while Vancouver's is right-skewed.
  2. Temperatures are higher in Miami: the peak is around 300, compared with around 280 for Vancouver.
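
Differences like these are easier to see when both histograms share one x-axis. Here is a hedged sketch of one way to do that with ggplot2's facet_wrap. The helper name compare_histograms and the bin width of 2 are assumptions for illustration; the book's solution draws the two plots separately.

```r
library(ggplot2)

# Draw stacked histograms from a data frame with one numeric column per city.
# `compare_histograms` is a hypothetical helper; binwidth = 2 is an arbitrary choice.
compare_histograms <- function(df, cities = c("Vancouver", "Miami"), binwidth = 2) {
  long <- stack(df[cities])              # long format: `values` (temperature), `ind` (city)
  long <- long[!is.na(long$values), ]    # drop missing readings before plotting
  ggplot(long, aes(x = values)) +
    geom_histogram(binwidth = binwidth) +
    facet_wrap(~ind, ncol = 1) +         # one panel per city, shared x-axis
    labs(x = "Temperature", y = "Count")
}

# Usage, assuming df_t was loaded as in the code above:
# compare_histograms(df_t)
```

Setting binwidth explicitly also avoids relying on geom_histogram()'s default of 30 bins, which can make two separately drawn plots harder to compare.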