
Alignment via index labels

Alignment of Series data by index labels is a fundamental concept in pandas, and one of its most powerful. Alignment automatically matches related values in multiple Series objects based on their index labels. This saves a lot of the error-prone effort of matching data across multiple sets using standard procedural techniques.

To demonstrate alignment, let's add the values in two Series objects. We start with the following two Series objects, which represent two different samples of a set of variables (a and b):
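A minimal sketch of such a pair of Series. The names s1 and s2 come from the text; the values, and the choice to list the labels in a different order in each Series, are assumptions made for illustration:

```python
import pandas as pd

# Illustrative values (assumed): two samples of the variables 'a' and 'b'.
# The labels are deliberately in a different order in each Series.
s1 = pd.Series([1, 2], index=['a', 'b'])
s2 = pd.Series([4, 3], index=['b', 'a'])

print(s1)
print(s2)
```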

Now suppose we would like to total the values for each variable. We can express this simply as s1 + s2:
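As a sketch, assuming two small Series whose values are chosen purely for illustration, the addition pairs values by label rather than by position:

```python
import pandas as pd

# Illustrative values (assumed); labels are in a different order in each Series
s1 = pd.Series([1, 2], index=['a', 'b'])
s2 = pd.Series([4, 3], index=['b', 'a'])

# Values are paired by label, not by position:
# a -> 1 + 3 = 4, b -> 2 + 4 = 6
total = s1 + s2
print(total)
```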

pandas has matched the measurement for each variable in each series, added those values, and returned us the sum for each in one succinct statement.

It is also possible to apply a scalar value to a Series. The scalar is combined with each value in the Series using the specified operation:
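For instance, multiplying a Series by 2 doubles every value and leaves the index unchanged. The Series values here are assumed for illustration:

```python
import pandas as pd

# Illustrative values (assumed)
s1 = pd.Series([1, 2], index=['a', 'b'])

# The scalar 2 is applied to every value; the index is preserved
doubled = s1 * 2
print(doubled)
```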

Remember earlier when it was stated that we would come back to creating a Series with a scalar value? When performing this type of operation, pandas actually performs the following actions:
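The two steps can be sketched as follows. This mirrors the description in the text conceptually and is not a claim about how pandas implements the operation internally; the name t and the values are assumptions:

```python
import pandas as pd

# Illustrative values (assumed)
s1 = pd.Series([1, 2], index=['a', 'b'])

# Step 1: build a Series from the scalar, reusing the index of s1
t = pd.Series(2, index=s1.index)

# Step 2: multiply the two Series; their indexes are identical,
# so every label aligns
result = s1 * t
print(result)
```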

The first step is the creation of a Series from the scalar value, but with the index of the target Series. The multiplication is then applied to the aligned values of the two Series objects, which perfectly align because the index is identical.

The labels in the indexes are not required to align. Where alignment does not occur, pandas will return NaN as the result:
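A short sketch of this behavior, assuming two Series that share only the label 'b' (the name s3 and all values are illustrative):

```python
import pandas as pd

# Illustrative values (assumed): only the label 'b' appears in both Series
s1 = pd.Series([1, 2], index=['a', 'b'])
s3 = pd.Series([5, 6], index=['b', 'c'])

# 'b' aligns and is summed; 'a' and 'c' have no partner and become NaN
partial = s1 + s3
print(partial)
```

Because NaN is a floating-point value, the result is a float Series even though both inputs hold integers.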

NaN is, by default, the result of any pandas alignment where an index label in one Series has no match in the other. This is an important characteristic of pandas when compared to NumPy: if labels do not align, pandas does not throw an exception. This helps when some data is missing and that is acceptable. Processing continues, but pandas lets you know there is an issue (though not necessarily a problem) by returning NaN.

Labels in a pandas index do not need to be unique. For duplicated labels, the alignment operation forms a Cartesian product of the matching labels in the two Series. If there are n 'a' labels in series 1 and m 'a' labels in series 2, then the result will contain n*m rows labeled 'a'.

To demonstrate this let's use the following two Series objects:
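One possible pair of Series with repeated labels is sketched below. The values are assumptions, chosen so that 'a' appears twice in the first Series and three times in the second:

```python
import pandas as pd

# Illustrative values (assumed)
# 'a' appears twice in s1 and three times in s2;
# 'b' exists only in s1 and 'c' only in s2
s1 = pd.Series([1.0, 2.0, 3.0], index=['a', 'a', 'b'])
s2 = pd.Series([4.0, 5.0, 6.0, 7.0], index=['a', 'a', 'c', 'a'])

print(s1)
print(s2)
```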

This will result in 6 'a' index labels and NaN for 'b' and 'c':
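A sketch of the addition, assuming a first Series with two 'a' labels and a second with three, so that 2 * 3 = 6 rows labeled 'a' are produced (all values are illustrative):

```python
import pandas as pd

# Illustrative values (assumed)
s1 = pd.Series([1.0, 2.0, 3.0], index=['a', 'a', 'b'])
s2 = pd.Series([4.0, 5.0, 6.0, 7.0], index=['a', 'a', 'c', 'a'])

# Each 'a' in s1 is paired with each 'a' in s2: 2 * 3 = 6 rows.
# 'b' and 'c' each exist in only one Series, so they become NaN.
combined = s1 + s2
print(combined)

# Count the rows labeled 'a' in the result
print((combined.index == 'a').sum())
```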
