
The pandas Series

The pandas Series is the base data structure of pandas. A series is similar to a NumPy array, but it differs by having an index, which allows for much richer lookup of items instead of just a zero-based array index value.

The following creates a Series from a Python list:
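A minimal sketch of this step, assuming pandas is imported as pd. The values 1 through 4 are illustrative assumptions, not necessarily those of the original listing.

```python
import pandas as pd

# Create a Series from a plain Python list; no index is given,
# so pandas supplies a default integer index.
s = pd.Series([1, 2, 3, 4])
print(s)
```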

The output consists of two columns of information. The first is the index and the second is the data in the Series. Each row of the output represents the index label (in the first column) and then the value associated with that label.

Because this Series was created without specifying an index (something we will do next), pandas automatically creates an integer index with labels starting at 0 and increasing by one for each data item.

The values of a Series object can then be accessed by using the [] operator, passing the label for the value you require. The following gets the value for the label 1:
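As a sketch, using an assumed sample Series with the default integer index:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])  # illustrative values (assumption)

# With the default index, label 1 refers to the second item.
value = s[1]
print(value)  # 2
```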

This looks very much like normal array access in many programming languages. But as we will see, the index does not have to start at 0 or increment by one, and its labels can be of many data types other than integers. This ability to associate flexible indexes with data is one of the great superpowers of pandas.

Multiple items can be retrieved by specifying their labels in a Python list. The following retrieves the values at labels 1 and 3:
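A short sketch of list-based lookup; the sample values are assumptions chosen for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])  # illustrative values (assumption)

# Passing a list of labels returns a new Series holding just those items.
subset = s[[1, 3]]
print(subset)
```

The result keeps the original labels 1 and 3 rather than being renumbered from 0.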

A Series object can be created with a user-defined index by using the index parameter and specifying the index labels. The following creates a Series with the same values but with an index consisting of string values:
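One way this might look, assuming the labels 'a' through 'd' (the labels 'a' and 'd' appear in the text; 'b' and 'c' and the values are assumptions):

```python
import pandas as pd

# The index parameter supplies one label per value, in order.
s = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
print(s)
```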

Data in the Series object can now be accessed by those alphanumeric index labels. The following retrieves the values at index labels 'a' and 'd':
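A sketch of label-based lookup on a string index; the Series contents are illustrative assumptions.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])  # assumed sample data

# Select the items labelled 'a' and 'd'.
subset = s[['a', 'd']]
print(subset)
```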

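It is still possible to refer to the elements of this Series object by their numerical 0-based position. Recent pandas versions deprecate doing this through the [] operator when the index is not integer-based, so the .iloc property is the reliable way: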
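A minimal sketch of positional access on a string-indexed Series. It uses .iloc, which selects strictly by 0-based position; the sample data is an assumption.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])  # assumed sample data

# Positions 1 and 2 are the second and third items (labels 'b' and 'c').
by_position = s.iloc[[1, 2]]
print(by_position)
```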

We can examine the index of a Series using the .index property:
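For example, on an assumed string-indexed Series:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])  # assumed sample data

# .index returns a pandas Index object holding the labels.
idx = s.index
print(idx)
```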

The index is itself actually a pandas object, and this output shows us the values of the index and the data type used for the index. In this case, note that the type of the data in the index (referred to as the dtype) is object and not string. We will examine how to change this later in the book.

A common usage of a Series in pandas is to represent a time series that associates date/time index labels with values. The following demonstrates this by creating a date range using the pd.date_range() pandas function:
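A sketch of building such a date range. The start and end dates here are assumptions picked for illustration; with no frequency given, pd.date_range() produces one entry per calendar day.

```python
import pandas as pd

# Six consecutive days, inclusive of both endpoints (dates are assumed).
dates = pd.date_range('2016-04-01', '2016-04-06')
print(dates)
```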

This has created a special index in pandas called DatetimeIndex, which is a specialized type of pandas index that is optimized to index data with dates and times.

Now let's create a Series using this index. The data values represent high temperatures on specific days:
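A possible version of this step; the dates and temperature values are illustrative assumptions, and temps1 is a name chosen for this sketch.

```python
import pandas as pd

dates = pd.date_range('2016-04-01', '2016-04-06')  # assumed dates

# One high temperature per day, labelled by date (values are assumed).
temps1 = pd.Series([80, 82, 85, 90, 83, 87], index=dates)
print(temps1)
```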

This type of series with a DatetimeIndex is referred to as a time series.

We can look up a temperature on a specific date by using the date as a string:
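A sketch of a date-string lookup against an assumed daily temperature series; the dates and values are illustrative.

```python
import pandas as pd

dates = pd.date_range('2016-04-01', '2016-04-06')          # assumed dates
temps1 = pd.Series([80, 82, 85, 90, 83, 87], index=dates)  # assumed values

# pandas parses the string and matches it against the DatetimeIndex.
temp = temps1['2016-04-04']
print(temp)  # 90
```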

Two Series objects can be applied to each other with an arithmetic operation. The following code creates a second Series and calculates the difference in temperature between the two:
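A minimal sketch of this subtraction. Both sets of temperatures and the names temps1, temps2, and temp_diffs are assumptions for illustration.

```python
import pandas as pd

dates = pd.date_range('2016-04-01', '2016-04-06')          # assumed dates
temps1 = pd.Series([80, 82, 85, 90, 83, 87], index=dates)  # assumed values
temps2 = pd.Series([70, 75, 69, 83, 79, 77], index=dates)  # assumed values

# Subtraction pairs up values that share the same index label.
temp_diffs = temps1 - temps2
print(temp_diffs)
```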

An arithmetic operation (+, -, /, *, ...) on two Series objects returns another Series object, with the values matched up by index label.

Because the index is not integer-based, we can also look up values by 0-based position:
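A sketch of positional lookup on a date-indexed Series. It uses .iloc, which always selects by 0-based position; the data is assumed for illustration.

```python
import pandas as pd

dates = pd.date_range('2016-04-01', '2016-04-06')             # assumed dates
temp_diffs = pd.Series([10, 7, 16, 7, 4, 10], index=dates)    # assumed values

# Position 2 is the third entry, here the one for 2016-04-03.
third = temp_diffs.iloc[2]
print(third)  # 16
```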

Finally, pandas provides many descriptive statistical methods. As an example, the following returns the mean of the temperature differences:
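For instance, with an assumed Series of temperature differences:

```python
import pandas as pd

dates = pd.date_range('2016-04-01', '2016-04-06')             # assumed dates
temp_diffs = pd.Series([10, 7, 16, 7, 4, 10], index=dates)    # assumed values

# mean() averages the values and ignores the index labels.
average = temp_diffs.mean()
print(average)  # 9.0
```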
