Activity: Creating a Histogram and Explaining its Features

Scenario

Histograms are useful when you want to find the peak and spread of a distribution. For example, a company might want to see how the ages of its clients are distributed. A two-dimensional distribution can also show relationships between variables; for example, a scatter plot of the incomes and ages of credit card holders.

Aim

To create and analyze histograms for the given dataset.

Prerequisites

You should be able to use ggplot2 to create a histogram.


The template code is an empty script in which the libraries are already loaded. You will write your code there.

Steps for Completion

  1. Use the template code and load the required datasets.
  2. Create a histogram for each of the two cities (Vancouver and Miami).
  3. Analyze and compare the two histograms to determine how they differ.

Outcome

Two histograms should be created and compared. The complete code is as follows:

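library(ggplot2)  # provides ggplot() and geom_histogram()
# Hourly temperature readings, one column per city
df_t <- read.csv("data/historical-hourly-weather-data/temperature.csv")
ggplot(df_t, aes(x = Vancouver)) + geom_histogram()  # histogram for Vancouver
ggplot(df_t, aes(x = Miami)) + geom_histogram()      # histogram for Miami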

Refer to the complete code at https://goo.gl/tu7t4y.
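
By default, geom_histogram() uses 30 bins and prints a message suggesting that you pick a bin width yourself. The following is a minimal sketch of one way to do that while placing both cities on a shared x axis. The helper name compare_histograms and the bin width of 2 are assumptions made for illustration; they are not part of the original activity.

```r
library(ggplot2)

# Hypothetical helper: stack the chosen city columns into long format
# and draw one histogram panel per city with a shared bin width.
compare_histograms <- function(df, cities, binwidth = 2) {
  long <- data.frame(
    city = rep(cities, each = nrow(df)),
    temperature = unlist(df[cities], use.names = FALSE)
  )
  # Drop missing readings so they do not trigger warnings when plotting
  long <- long[!is.na(long$temperature), ]
  ggplot(long, aes(x = temperature)) +
    geom_histogram(binwidth = binwidth) +
    facet_wrap(~city, ncol = 1)  # panels stacked vertically, same x axis
}

# Usage with the data frame loaded above (not run here):
# compare_histograms(df_t, c("Vancouver", "Miami"))
```

Stacking the panels vertically keeps the temperature axis aligned, so a shift in the peak or a longer tail on one side is visible at a glance.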

Take a look at the following output histogram for Vancouver:

From the preceding plot, we can determine the following information:

  • The peak of Vancouver's temperature distribution (its most frequent value) is around 280 K.
  • Temperatures range between 260 K and 300 K.
  • It's a right-skewed distribution.

Take a look at the following output histogram for Miami:

From the preceding plot, we can determine the following information:

  • The peak of Miami's temperature distribution (its most frequent value) is around 300 K.
  • Temperatures range between 280 K and 308 K.
  • It's a left-skewed distribution.

Differences

  1. Vancouver's temperature distribution is skewed to the right, while Miami's is skewed to the left.
  2. The peak temperature is higher for Miami (around 300 K) than for Vancouver (around 280 K).
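
Readings taken by eye from a histogram (peak, range, and direction of skew) can be cross-checked numerically. The sketch below uses base R only. The helper name describe_distribution and the bin width of 2 are assumptions for illustration, and the skewness shown is the simple moment-based estimate, where a positive value indicates a right skew and a negative value a left skew.

```r
# Hypothetical helper: numeric summary of what a histogram shows.
describe_distribution <- function(x, binwidth = 2) {
  x <- x[!is.na(x)]
  # Bin edges aligned to multiples of binwidth and covering all values
  breaks <- seq(floor(min(x) / binwidth) * binwidth,
                ceiling(max(x) / binwidth) * binwidth + binwidth,
                by = binwidth)
  h <- hist(x, breaks = breaks, plot = FALSE)
  m <- mean(x)
  list(
    peak = h$mids[which.max(h$counts)],  # midpoint of the tallest bin
    range = range(x),                    # smallest and largest value
    skewness = mean((x - m)^3) / mean((x - m)^2)^1.5
  )
}

# Usage with the data frame from the activity (not run here):
# lapply(df_t[c("Vancouver", "Miami")], describe_distribution)
```

The reported peak depends on the chosen bin width, so it is worth trying a couple of values before drawing conclusions.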