
Loading data from files into a DataFrame

The pandas library provides facilities for easy retrieval of data from a variety of data sources as pandas objects. As a quick example, let's examine the ability of pandas to load data in CSV format.

This example uses a file provided with the code from this book, data/goog.csv, which contains time series financial information for Google stock.

The following statement uses the operating system (from within Jupyter Notebook or IPython) to display the content of this file. Which command you will need to use depends on your operating system:
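As an operating-system-independent alternative, the first few lines of a text file can also be printed from Python itself. The following is only a sketch: show_head is a hypothetical helper name, not part of pandas or of the book's code, and the path in the commented call is the one named in the text.

```python
from itertools import islice


def show_head(path, n=5):
    """Print and return the first n lines of a text file."""
    with open(path, encoding="utf-8") as f:
        # islice stops after n lines, so large files are not read in full
        lines = [line.rstrip("\n") for line in islice(f, n)]
    print("\n".join(lines))
    return lines


# Example call, assuming the file exists relative to the working directory:
# show_head("data/goog.csv")
```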

This information can be easily imported into a DataFrame using the pd.read_csv() function:

pandas has no idea that the first column in the file is a date and has treated the contents of the date field as a string. This can be verified using the following pandas statement, which shows the type of the Date column as a string:
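A minimal sketch of the load and the type check follows. So that it runs without the book's data file, it reads a small in-memory stand-in; the column layout (apart from Date) and every value in it are made-up placeholders, not the real contents of data/goog.csv.

```python
import io

import pandas as pd

# Made-up rows standing in for data/goog.csv (placeholder values)
sample = io.StringIO(
    "Date,Open,Close\n"
    "2016-12-19,790.0,794.0\n"
    "2016-12-20,796.0,796.5\n"
)

# With no extra arguments, the Date column is read as text
df = pd.read_csv(sample)

first_date = df["Date"][0]
print(type(first_date))  # the element is a Python str, not a date
```

With the real file, the same call would be pd.read_csv('data/goog.csv').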

The parse_dates parameter of the pd.read_csv() function can be used to guide pandas on how to convert data directly into a pandas date object. The following informs pandas to convert the content of the Date column into actual Timestamp objects:

If we check whether it worked, we see that the date is a Timestamp:
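The conversion and the check can be sketched as below, again on an in-memory stand-in whose rows are made-up placeholders rather than the real contents of data/goog.csv.

```python
import io

import pandas as pd

# Made-up rows standing in for data/goog.csv (placeholder values)
sample = io.StringIO(
    "Date,Open,Close\n"
    "2016-12-19,790.0,794.0\n"
    "2016-12-20,796.0,796.5\n"
)

# parse_dates takes a list of the columns to convert to dates
df = pd.read_csv(sample, parse_dates=["Date"])

first_date = df["Date"][0]
print(type(first_date))  # now a pandas Timestamp
```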

Unfortunately, this has not used the date field as the index for the DataFrame. Instead, it uses the default zero-based integer index labels:

Note that this is now a RangeIndex, where in previous versions of pandas it would have been an integer index. We'll examine this difference later in the book.
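Inspecting the index attribute shows the default labels. This sketch assumes a recent pandas version and uses placeholder rows in place of the real file.

```python
import io

import pandas as pd

# Made-up rows standing in for data/goog.csv (placeholder values)
sample = io.StringIO(
    "Date,Open,Close\n"
    "2016-12-19,790.0,794.0\n"
    "2016-12-20,796.0,796.5\n"
)

df = pd.read_csv(sample, parse_dates=["Date"])

# No index column was requested, so pandas generates labels 0, 1, ...
print(df.index)
```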

This can be fixed using the index_col parameter of the pd.read_csv() function to specify which column in the file should be used as the index:

The index is now a DatetimeIndex, which lets us look up rows using dates.
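A short sketch of combining index_col with parse_dates and then looking up a value by date. The rows are made-up placeholders, so the value retrieved here says nothing about the real file.

```python
import io

import pandas as pd

# Made-up rows standing in for data/goog.csv (placeholder values)
sample = io.StringIO(
    "Date,Open,Close\n"
    "2016-12-19,790.0,794.0\n"
    "2016-12-20,796.0,796.5\n"
)

# Parse the Date column and use it as the row index in one step
df = pd.read_csv(sample, parse_dates=["Date"], index_col="Date")

print(type(df.index))  # a DatetimeIndex

# Look up a single value by its date label and column name
close = df.loc[pd.Timestamp("2016-12-20"), "Close"]
print(close)
```

Because Date is now the index, it no longer appears among the regular columns.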
