Loading data into memory – viewing and managing with ease using pandas

First, we will need to load data into memory so that Python can interact with it. Pandas will be our data management and manipulation library:

# load data into Pandas
import pandas as pd
df = pd.read_csv("./data/iris.csv")

Let's use some built-in pandas features to sanity check the load and confirm that everything came through properly. First, we use the .shape attribute to check the size of the data, reported as (rows, columns). Next, we check the contents of the DataFrame with the .head() method, which returns the first five rows as a new, smaller DataFrame for easy viewing. Finally, we use the .describe() method to show summary statistics for each numeric feature.

Pandas has many more sanity check and quick view features. For example, .tail() returns the final five rows of the data. Becoming proficient in pandas is undoubtedly worth the time investment. Good places to start are the dedicated chapter that appears later in the book and the essential basic functionality page (https://pandas.pydata.org/pandas-docs/stable/getting_started/basics.html) on the pandas documentation site.
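As a rough illustration of a few of these additional quick-view features, here is a minimal sketch. It reads a tiny in-memory CSV rather than ./data/iris.csv, so it runs without the data file. The column names, the six sample rows, and the variable names (sample_df and so on) are assumptions made for this example and may not match the actual file:

```python
import io

import pandas as pd

# Hypothetical stand-in for the iris file: column names and values are
# illustrative assumptions, not the contents of ./data/iris.csv
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.4,3.2,4.5,1.5,versicolor
6.3,3.3,6.0,2.5,virginica
5.8,2.7,5.1,1.9,virginica
"""
sample_df = pd.read_csv(io.StringIO(csv_text))

# .tail() returns the final five rows by default; pass n for a different count
last_rows = sample_df.tail()
last_two = sample_df.tail(2)

# .dtypes shows the type pandas inferred for each column
column_types = sample_df.dtypes

# .isna().sum() counts the missing values in each column
missing_counts = sample_df.isna().sum()

# .value_counts() tallies how often each label appears in a column
class_counts = sample_df["species"].value_counts()

print(last_rows)
print(column_types)
print(missing_counts)
print(class_counts)
```

The same calls should work on the df loaded earlier, provided the label column name is adjusted to whatever the real file uses.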
# sanity check with Pandas
print("shape of data in (rows, columns) is " + str(df.shape))
print(df.head())
print(df.describe().transpose())

You will see the following output after executing the preceding code:
