Loading the Data into Jupyter Using a Pandas DataFrame

Data is often stored in tables, which means it can be saved as a comma-separated values (CSV) file. This format, and many others, can be read into Python as a DataFrame object using the Pandas library. Other common formats include tab-separated values (TSV), SQL tables, and JSON data structures, and Pandas supports all of these. In this example, however, we are not going to load the data this way because the dataset is available directly through scikit-learn.
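
For reference, reading these formats with Pandas might look like the following minimal sketch. The sample data and the variable names (csv_text, tsv_text, json_text) are made up for illustration, and in-memory strings stand in for files on disk so that the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical sample data; in practice these would be files on disk.
csv_text = "name,age\nAda,36\nLin,29\n"
tsv_text = "name\tage\nAda\t36\nLin\t29\n"
json_text = '[{"name": "Ada", "age": 36}, {"name": "Lin", "age": 29}]'

# CSV: read_csv accepts a file path or any file-like object.
df_csv = pd.read_csv(io.StringIO(csv_text))

# TSV: same function, with the separator set to a tab character.
df_tsv = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# JSON: a list of records becomes one row per record.
df_json = pd.read_json(io.StringIO(json_text))

print(df_csv)
print(df_tsv)
print(df_json)
```

With a real file, the io.StringIO wrapper would be replaced by a path to the file. SQL tables can be loaded in a similar way with pd.read_sql, which additionally requires a database connection.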

An important step after loading data for analysis is ensuring that it is clean. For example, we would generally need to deal with missing data and ensure that all columns have the correct data types. The dataset we use in this section has already been cleaned, so we will not need to worry about this. However, we'll see messier data in the second chapter and explore techniques for dealing with it.
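
As a rough illustration of what such checks can look like, the sketch below counts missing values per column and corrects a column that was read with the wrong data type. The small DataFrame and its column names are hypothetical and are not taken from the dataset used in this section. Dropping incomplete rows is only one of several possible ways to handle missing data.

```python
import numpy as np
import pandas as pd

# Hypothetical messy data: "age" was read as text and both columns have gaps.
df = pd.DataFrame({
    "age": ["36", "29", None],
    "height_cm": [170.0, np.nan, 182.5],
})

# Count the missing values in each column.
missing_per_column = df.isnull().sum()
print(missing_per_column)

# Convert the text column to a numeric type; missing entries become NaN.
df["age"] = pd.to_numeric(df["age"])
print(df.dtypes)

# One simple strategy: drop every row that has at least one missing value.
df_clean = df.dropna()
print(df_clean)
```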