Our First Analysis - The Boston Housing Dataset

So far, this chapter has focused on the features and basic usage of Jupyter. Now, we'll put this into practice and do some data exploration and analysis.

The dataset we'll look at in this section is the so-called Boston housing dataset. It contains US census data concerning houses in various areas around the city of Boston. Each sample corresponds to a unique area and has about a dozen measures. We should think of samples as rows and measures as columns. The data was first published in 1978 and is quite small, containing only about 500 samples.

Now that we know something about the context of the dataset, let's decide on a rough plan for the exploration and analysis. Normally, such a plan would be shaped by the question(s) under study. In this case, the goal is not to answer a specific question but to show Jupyter in action and illustrate some basic data analysis methods.

Our general approach to this analysis will be to do the following:

  • Load the data into Jupyter using a Pandas DataFrame
  • Quantitatively understand the features
  • Look for patterns and generate questions
  • Answer the questions that arise
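
The steps above can be sketched in a few lines of Pandas. This is a minimal illustration and not the analysis itself: the values are placeholders invented for the example (they are not real Boston housing records), and the column names `rooms`, `age`, and `price` are hypothetical stand-ins for the dataset's measures.

```python
import pandas as pd

# Placeholder values for illustration only; NOT real Boston housing records.
# The column names are hypothetical stand-ins for the dataset's measures.
toy = {
    "rooms": [4.0, 5.0, 6.0, 7.0, 8.0],
    "age": [85.0, 72.0, 48.0, 33.0, 10.0],
    "price": [12.0, 18.0, 24.0, 30.0, 36.0],
}

# Step 1: load the data into a Pandas DataFrame
# (rows are samples, columns are measures).
df = pd.DataFrame(toy)

# Step 2: quantitatively understand the features.
summary = df.describe()

# Step 3: look for patterns, here via pairwise linear correlations.
correlations = df.corr()

# Step 4: answer a question raised by the patterns, for example
# "which measure is most strongly related to price?"
strength = correlations["price"].drop("price").abs()
strongest = strength.idxmax()

print(df.shape)
print(summary.loc[["mean", "min", "max"]])
print(strength)
print("Most strongly related to price:", strongest)
```

With the real dataset, only the first step changes (the DataFrame would be built from the loaded data); the summary and correlation calls stay the same.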