Pandas binary operations

2021-04-07

When performing binary operations between pandas data structures, pay attention to the following two key points:

  • Broadcasting between higher-dimensional (DataFrame) and lower-dimensional (Series) objects
  • Handling of missing values in calculations

Both issues can be handled at the same time, but we first look at each one separately.

Matching / broadcasting

DataFrame supports binary operations through the methods add(), sub(), mul(), div() and their reversed counterparts radd(), rsub(), and so on. For broadcasting, the main concern is a Series input: pass the axis keyword to these methods to match the Series on either the index or the columns.
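A minimal sketch of matching on each axis. The frame and its values are made up for illustration; only sub() and the axis keyword come from the text above.

```python
import pandas as pd

df = pd.DataFrame(
    {"one": [1.0, 2.0, 3.0], "two": [4.0, 5.0, 6.0]},
    index=["a", "b", "c"],
)

row = df.iloc[1]    # Series labelled by the column names
column = df["two"]  # Series labelled by the row index

# Match the Series against the column labels and subtract it from every row.
by_columns = df.sub(row, axis="columns")

# Match the Series against the row index and subtract it from every column.
by_index = df.sub(column, axis="index")

print(by_columns)
print(by_index)
```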

You can also align a Series with one level of a DataFrame that has a multi-level index (MultiIndex).
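The level-based matching might look like the following sketch. The index names "first" and "second", the column "x", and the offsets Series are hypothetical; the level keyword is the part that selects which index level the Series is matched against.

```python
import pandas as pd

# Hypothetical frame with a two-level row index named "first" and "second".
index = pd.MultiIndex.from_tuples(
    [(1, "a"), (1, "b"), (2, "a"), (2, "b")], names=["first", "second"]
)
dfmi = pd.DataFrame({"x": [10, 20, 30, 40]}, index=index)

# Series labelled only by the values of the "second" level.
offsets = pd.Series([1, 2], index=["a", "b"])

# Each row is matched to the offset whose label equals its "second" value.
result = dfmi.sub(offsets, axis=0, level="second")

print(result)
```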

Series and Index also support the built-in divmod() function. It performs floor division and the modulo operation at the same time and returns a two-element tuple whose items have the same type as the left-hand operand.
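As a small illustration with made-up data, divmod() on a Series and on an Index with a scalar divisor:

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(10))
idx = pd.Index(np.arange(10))

# Each call returns a (quotient, remainder) tuple.
s_div, s_rem = divmod(s, 3)
idx_div, idx_rem = divmod(idx, 3)

print(type(s_div).__name__, s_div.tolist(), s_rem.tolist())
print(type(idx_div).__name__, idx_div.tolist(), idx_rem.tolist())
```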

divmod() also works element-wise when the right-hand operand is a sequence of the same length.
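A sketch of the element-wise form, where each element of the Series is divided by the divisor at the same position (the divisors are arbitrary example values):

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(10))
divisors = [2, 2, 3, 3, 4, 4, 5, 5, 6, 6]

# Position i of s is divided by position i of divisors.
div, rem = divmod(s, divisors)

print(div.tolist())
print(rem.tolist())
```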

Missing values and fill values

The arithmetic methods of Series and DataFrame accept a fill_value option, which substitutes the given value for a missing value at a position. For example, when two DataFrames are added, the result is still NaN wherever both DataFrames are missing a value at the same position; where only one of them is missing a value, fill_value replaces that NaN before the addition. You can also use fillna afterwards to replace any remaining NaN with the value you want.
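The three cases described here can be sketched with two small, made-up frames:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"a": [1.0, np.nan, np.nan]})
df2 = pd.DataFrame({"a": [10.0, 20.0, np.nan]})

plain = df1 + df2                    # NaN wherever either side is missing
filled = df1.add(df2, fill_value=0)  # NaN only where both sides are missing
cleaned = filled.fillna(-1)          # replace what is left afterwards

print(plain)
print(filled)
print(cleaned)
```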

Comparison operations

Series and DataFrame also support the comparison methods eq, ne, lt, gt, le, and ge.
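A brief sketch of two of these methods on made-up frames; the others follow the same pattern:

```python
import pandas as pd

df = pd.DataFrame({"one": [1.0, 2.0, 3.0], "two": [3.0, 2.0, 1.0]})
df2 = pd.DataFrame({"one": [1.0, 5.0, 0.0], "two": [3.0, 0.0, 4.0]})

greater = df.gt(df2)    # element-wise df > df2
not_equal = df.ne(df2)  # element-wise df != df2

print(greater)
print(not_equal)
```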

These operations produce a pandas object of the same type as the left-hand input, with dtype bool. Such boolean objects can be used for indexing.
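For instance, a boolean Series produced by gt() can serve as a mask. The data below is hypothetical:

```python
import pandas as pd

s = pd.Series([1, 5, 2, 8], index=list("abcd"))

mask = s.gt(3)      # boolean Series with the same index as s
selected = s[mask]  # keep only the rows where the mask is True

print(mask.dtype)
print(selected)
```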

Boolean reductions

The reductions empty, any(), all(), and bool() summarize a boolean result as a smaller result, down to a single Boolean value.
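A sketch of all() and any() applied to a boolean DataFrame (column names and values are invented for the example). By default each reduces along the rows and yields one Boolean per column:

```python
import pandas as pd

df = pd.DataFrame(
    {"one": [1, 2, 3], "two": [-1, 2, 3], "three": [-1, -2, -3]}
)

all_positive = (df > 0).all()  # True only if every value in the column is > 0
any_positive = (df > 0).any()  # True if at least one value in the column is > 0

print(all_positive)
print(any_positive)
```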

Applying a reduction a second time collapses the per-column result to a single Boolean value.
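Chaining the reduction, sketched on a made-up frame:

```python
import pandas as pd

df = pd.DataFrame(
    {"one": [1, 2, 3], "two": [-1, 2, 3], "three": [-1, -2, -3]}
)

# First any() gives one Boolean per column, the second reduces those to one.
any_positive_at_all = bool((df > 0).any().any())
all_positive_everywhere = bool((df > 0).all().all())

print(any_positive_at_all, all_positive_everywhere)
```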

The empty attribute tells you whether a pandas object is empty.
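A short illustration with one populated frame and one that has columns but no rows:

```python
import pandas as pd

populated = pd.DataFrame({"A": [1, 2]})
no_rows = pd.DataFrame(columns=list("ABC"))

print(populated.empty)  # the frame holds data
print(no_rows.empty)    # columns are defined but there are no rows
```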

Use the bool() method to evaluate a single-element pandas object as a Boolean value.
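A hedged sketch: Series.bool() is deprecated in recent pandas releases (2.1 and later), so this example swaps in Series.item(), which extracts the single element in both old and new versions. It also shows why an explicit call is needed, since using a multi-element Series directly in a Boolean context raises an error.

```python
import pandas as pd

single = pd.Series([True])

# item() returns the one element; it stands in here for the older bool().
value = single.item()
print(value)

# A Series with several elements has no unambiguous truth value.
try:
    bool(pd.Series([True, False]))
    ambiguous = False
except ValueError:
    ambiguous = True
print(ambiguous)
```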

Comparing whether objects are equivalent

Often the same result can be computed in more than one way, for example df + df and df * 2. To test whether the two calculations give the same result, you might try (df + df == df * 2).all(), but when the data contains NaN this expression comes out False.

The Boolean DataFrame df + df == df * 2 contains False values because two NaN values never compare as equal.
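The effect can be reproduced with a small frame that contains one NaN (the data is invented for the example):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"one": [1.0, np.nan, 3.0], "two": [4.0, 5.0, 6.0]})

comparison = df + df == df * 2  # False at the position that holds NaN
per_column = comparison.all()   # so the column with NaN reduces to False

nan_equal = np.nan == np.nan    # NaN is not equal to itself

print(comparison)
print(per_column)
print(nan_equal)
```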

To test whether data is equivalent, N-dimensional structures such as Series and DataFrame provide the equals() method, which treats NaN values at corresponding positions as equal.
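A sketch of equals() on the same kind of NaN-bearing frame (values are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"one": [1.0, np.nan, 3.0], "two": [4.0, 5.0, 6.0]})

# equals() returns one Boolean for the whole object and treats NaN == NaN.
same = (df + df).equals(df * 2)

print(same)
```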

Note that the indexes of the Series or DataFrame objects must be in the same order for equals() to return True.
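To illustrate the ordering requirement, here are two made-up frames that hold the same labelled data in different row order:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"col": [1.0, 2.0, np.nan]})
df2 = pd.DataFrame({"col": [np.nan, 2.0, 1.0]}, index=[2, 1, 0])

different_order = df1.equals(df2)          # same labels, different order
after_sorting = df1.equals(df2.sort_index())  # order restored before comparing

print(different_order, after_sorting)
```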

Comparing array-like objects

Comparing a pandas data structure element-wise with a scalar value is straightforward.
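A minimal example with a Series and an Index of strings:

```python
import pandas as pd

series_result = pd.Series(["foo", "bar", "baz"]) == "foo"  # boolean Series
index_result = pd.Index(["foo", "bar", "baz"]) == "foo"    # boolean NumPy array

print(series_result.tolist())
print(index_result.tolist())
```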

pandas can also compare two array-like objects of equal length element-wise.
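A sketch comparing a Series with an Index and with a NumPy array of the same length (the strings are arbitrary):

```python
import numpy as np
import pandas as pd

s = pd.Series(["foo", "bar", "baz"])

vs_index = s == pd.Index(["foo", "bar", "qux"])
vs_array = s == np.array(["foo", "bar", "qux"])

print(vs_index.tolist())
print(vs_array.tolist())
```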

Comparing Index or Series objects of different lengths raises a ValueError.
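The error can be observed with two Series of different lengths; the try/except wrapper is only there so the example runs to completion:

```python
import pandas as pd

left = pd.Series(["foo", "bar", "baz"])
right = pd.Series(["foo", "bar"])

try:
    left == right
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```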

Note that this differs from NumPy, where a comparison can be broadcast.
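For contrast, NumPy stretches a length-1 array across the longer one instead of raising:

```python
import numpy as np

# The single element 2 is compared against every element on the left.
broadcast = np.array([1, 2, 3]) == np.array([2])

print(broadcast)
```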

When NumPy cannot broadcast the operands, older NumPy releases return False for the comparison; recent releases raise an error instead.
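Because the outcome depends on the installed NumPy version, the sketch below records whichever behavior occurs rather than assuming one. The outcome variable is just a label for this example.

```python
import numpy as np

# Shapes (3,) and (2,) cannot be broadcast against each other.
try:
    result = np.array([1, 2, 3]) == np.array([1, 2])
    # Older NumPy: the comparison falls back to a single False.
    outcome = "returned " + str(bool(np.all(result)))
except Exception:
    # Newer NumPy: the shape mismatch is reported as an error.
    outcome = "raised"

print(outcome)
```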

