
Our First Analysis - The Boston Housing Dataset

So far, this chapter has focused on the features and basic usage of Jupyter. Now, we'll put this into practice and do some data exploration and analysis.

The dataset we'll look at in this section is the so-called Boston housing dataset. It contains US census data concerning houses in various areas around the city of Boston. Each sample corresponds to a unique area and has about a dozen measures. We should think of samples as rows and measures as columns. The data was first published in 1978 and is quite small, containing only about 500 samples.

Now that we know something about the context of the dataset, let's decide on a rough plan for the exploration and analysis. Ordinarily, such a plan would be shaped by the question(s) under study. In this case, the goal is not to answer a particular question but to show Jupyter in action and illustrate some basic data analysis methods.

Our general approach to this analysis will be to do the following:

  • Load the data into Jupyter using a Pandas DataFrame
  • Quantitatively understand the features
  • Look for patterns and generate questions
  • Answer the questions that arise
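
As a rough illustration of how these four steps might look in a notebook cell, here is a minimal sketch. The values and the column names (rooms, age, value) are made-up placeholders, not the real Boston housing data, and the question asked in the last step is only an example.

```python
import pandas as pd

# Placeholder values for illustration only; this is NOT the real dataset.
# The column names are hypothetical stand-ins for the dataset's measures.
data = {
    "rooms": [6.0, 5.5, 7.0, 6.5, 5.0],
    "age": [60.0, 80.0, 30.0, 45.0, 90.0],
    "value": [24.0, 20.0, 35.0, 28.0, 15.0],
}

# Step 1: load the data into a DataFrame (rows are samples, columns are measures)
df = pd.DataFrame(data)

# Step 2: quantitatively understand the features with summary statistics
summary = df.describe()

# Step 3: look for patterns, here via pairwise correlations
corr = df.corr()

# Step 4: answer an example question: which measure tracks "value" most closely?
strongest = corr["value"].drop("value").abs().idxmax()

print(summary)
print(corr)
print("Most strongly correlated with value:", strongest)
```

In a real analysis, the first step would read the actual data from a file or library instead of a hand-written dictionary; the remaining steps would stay much the same.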