- Learning pandas(Second Edition)
- Michael Heydt
- 2021-07-02 20:37:12
Alignment via index labels
Alignment of Series data by index labels is one of the most fundamental and most powerful concepts in pandas. Alignment automatically correlates related values across multiple Series objects based on their index labels. This avoids the error-prone work of matching up data from multiple sets by hand with standard procedural techniques.
To demonstrate alignment, let's perform an example of adding values in two Series objects. Let's start with the following two Series objects representing two different samples of a set of variables (a and b):
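As a minimal sketch, two such Series could be created as follows. The specific values and the label order in `s2` are illustrative assumptions, chosen so that the two indexes hold the same labels in a different order.

```python
import pandas as pd

# Two samples of the variables a and b (values are illustrative assumptions)
s1 = pd.Series([1, 2], index=['a', 'b'])

# Same labels as s1, deliberately stored in a different order
s2 = pd.Series([4, 3], index=['b', 'a'])

print(s1)
print(s2)
```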


Now suppose we would like to total the values for each variable. We can express this simply as s1 + s2:
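A short sketch of the addition, using assumed sample values. Values are paired by label rather than by position, so the differing order of the labels in the two indexes does not matter.

```python
import pandas as pd

# Illustrative values (assumptions); s2 lists its labels in reverse order
s1 = pd.Series([1, 2], index=['a', 'b'])
s2 = pd.Series([4, 3], index=['b', 'a'])

# Values are matched by index label, not by position
total = s1 + s2
print(total)  # a -> 1 + 3, b -> 2 + 4
```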

pandas has matched the measurement for each variable in each series, added those values, and returned us the sum for each in one succinct statement.
It is also possible to apply a scalar value to a Series. The scalar is applied to each value in the Series using the specified operation:
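For instance, multiplying a Series by 2 doubles every value and leaves the index unchanged. The values here are assumed for illustration.

```python
import pandas as pd

# Illustrative values (assumptions)
s1 = pd.Series([1, 2], index=['a', 'b'])

# The scalar is applied to every element; the index is preserved
doubled = s1 * 2
print(doubled)
```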

Remember earlier when it was stated that we would come back to creating a Series with a scalar value? When performing this type of operation, pandas behaves as if it performed the following actions:
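The following sketch spells out those actions explicitly. It is a conceptual equivalent with assumed values and a hypothetical variable name `t`, not a description of the internal implementation in pandas.

```python
import pandas as pd

# Illustrative values (assumptions)
s1 = pd.Series([1, 2], index=['a', 'b'])

# Step 1: build a Series from the scalar, reusing the index of the target Series
t = pd.Series(2, index=s1.index)

# Step 2: multiply the two Series; the labels align one-to-one
result = s1 * t
print(result)
```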


The first step is the creation of a Series from the scalar value, but with the index of the target Series. The multiplication is then applied to the aligned values of the two Series objects, which perfectly align because the index is identical.
The labels in the indexes are not required to align. Where alignment does not occur, pandas will return NaN as the result:
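A small sketch with assumed values: the two Series below share only the label `b`, so `a` and `c` have no partner in the other Series and come back as NaN.

```python
import pandas as pd

# Illustrative values (assumptions); only the label 'b' appears in both Series
s1 = pd.Series([1, 2], index=['a', 'b'])
s3 = pd.Series([5, 6], index=['b', 'c'])

# 'a' and 'c' have no match in the other Series, so their results are NaN
partial = s1 + s3
print(partial)
```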


NaN is, by default, the result for any index label that does not align with a label in the other Series. This is an important characteristic of pandas when compared to NumPy: if labels do not align, no exception is thrown. This helps when some data is missing and that is acceptable. Processing continues, and pandas lets you know there is an issue (but not necessarily a problem) by returning NaN.
Labels in a pandas index do not need to be unique. When a label is duplicated, the alignment operation forms a Cartesian product of the matching labels in the two Series. If there are n 'a' labels in series 1 and m 'a' labels in series 2, then the result will contain n*m rows labeled 'a'.
To demonstrate this let's use the following two Series objects:
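A sketch of two such Series, with values assumed for illustration: `s1` carries the label `a` twice and `s2` carries it three times.

```python
import pandas as pd

# Illustrative values (assumptions): 'a' appears twice in s1 ...
s1 = pd.Series([1.0, 2.0, 3.0], index=['a', 'a', 'b'])

# ... and three times in s2
s2 = pd.Series([4.0, 5.0, 6.0, 7.0], index=['a', 'a', 'c', 'a'])

print(s1)
print(s2)
```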


This will result in 6 'a' index labels and NaN for 'b' and 'c':
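The sketch below reproduces that outcome with assumed values. Two 'a' labels in one Series and three in the other give 2 * 3 = 6 rows labeled 'a', while 'b' and 'c' each appear in only one Series and therefore yield NaN.

```python
import pandas as pd

# Illustrative values (assumptions): two 'a' labels in s1, three in s2
s1 = pd.Series([1.0, 2.0, 3.0], index=['a', 'a', 'b'])
s2 = pd.Series([4.0, 5.0, 6.0, 7.0], index=['a', 'a', 'c', 'a'])

# Every 'a' in s1 is paired with every 'a' in s2
result = s1 + s2
print(result)

# Count the rows produced for the duplicated label
print((result.index == 'a').sum())
```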
