Our First Analysis - The Boston Housing Dataset

So far, this chapter has focused on the features and basic usage of Jupyter. Now, we'll put this into practice and do some data exploration and analysis.

The dataset we'll look at in this section is the so-called Boston housing dataset. It contains US census data concerning houses in various areas around the city of Boston. Each sample corresponds to a unique area and has about a dozen measures. We should think of samples as rows and measures as columns. The data was first published in 1978 and is quite small, containing only about 500 samples.

Now that we know something about the context of the dataset, let's decide on a rough plan for the exploration and analysis. Ordinarily, such a plan would be shaped by the question(s) under study. In this case, however, the goal is not to answer a specific question but to show Jupyter in action and illustrate some basic data analysis methods.

Our general approach to this analysis will be to do the following:

  • Load the data into Jupyter using a Pandas DataFrame
  • Quantitatively understand the features
  • Look for patterns and generate questions
  • Answer the questions that arise from those patterns
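
As a rough illustration of how these four steps might look in a notebook cell, here is a minimal sketch. It assumes Pandas is installed and uses a tiny hand-made table in place of the real data: the column names (rooms, age, price) and all values are hypothetical placeholders, not figures from the Boston housing dataset. In practice, the first step would read the actual data instead, for example with pd.read_csv on a local copy of the file.

```python
import pandas as pd

# Step 1: load the data into a DataFrame.
# Placeholder values for illustration only; this is NOT the real dataset,
# and the column names are hypothetical stand-ins.
df = pd.DataFrame(
    {
        "rooms": [4.0, 5.0, 6.0, 7.0, 8.0],
        "age": [90.0, 60.0, 65.0, 30.0, 20.0],
        "price": [10.0, 15.0, 20.0, 25.0, 30.0],
    }
)

# Each row is a sample and each column is a measure.
n_samples, n_measures = df.shape

# Step 2: quantitatively understand the features with summary statistics
# (count, mean, standard deviation, min, quartiles, max per column).
summary = df.describe()

# Step 3: look for patterns, here via pairwise Pearson correlations.
corr = df.corr()

# Step 4: answer a question raised by the patterns, for example
# "which measure is most strongly correlated with price?"
strongest = corr["price"].drop("price").abs().idxmax()

print(summary)
print(corr)
print("Most strongly correlated with price:", strongest)
```

Correlation is only one of several ways to look for patterns; it is used here because it is a single method call on the DataFrame and gives a quick first overview.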