## [Advanced Python] Do you really understand ndarray in NumPy?

There are three words 2021-08-09 15:46:50

If you want to master Python, NumPy is something you have to master. NumPy is an extension library for the Python language that supports high-dimensional array and matrix operations and provides a large library of mathematical functions.

1 ndarray Memory mechanism

One of NumPy's most important features is its N-dimensional array object, ndarray. Internally, an ndarray usually consists of the following parts.

Data pointer: a pointer to the actual data;

Data type (dtype): describes the type of each element in the array, and with it the number of bytes each element occupies;

Dimensions (shape): a tuple giving the shape of the array (the size of each dimension);

Strides (strides): a tuple giving, for each dimension, the number of bytes to cross to get from the current element to the next element along that dimension.

Let's use the following code to look at the contents of an ndarray:

import numpy as np

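# dtype is fixed to int32 so the output matches the values discussed below on every platform
a = np.arange(1, 25, dtype=np.int32).reshape((2, 2, 2, 3))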

print(type(a))

print(a.shape)

print(a.dtype)

print(a.strides)

Here shape gives the number of elements in each dimension. For this four-dimensional array the counts are 2, 2, 2 and 3. Wait, this seems wrong: 2+2+2+3=9, not 24! Is the analysis wrong? Of course not. The counts are multiplied, not added, so the total is 2*2*2*3=24. To see why, start from the printed output of a:

[Figure: printed output of a, with the nested brackets highlighted in color]

Each innermost [ ] is a one-dimensional array with 3 elements, so the fourth dimension has 3 elements. Each pair of bold red [ ] is a two-dimensional array that contains 2 one-dimensional arrays, so the third dimension has 2 elements. Each pair of bold purple [ ] is a three-dimensional array that contains 2 two-dimensional arrays, so the second dimension has 2 elements. By the same reasoning, the first dimension has 2 elements. That is why the shape tuple is (2,2,2,3).
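The relationship between shape and the element count can be checked directly. The following is a minimal sketch; the variable names `total` and `nesting` are chosen only for illustration.

```python
import numpy as np

a = np.arange(1, 25).reshape((2, 2, 2, 3))

# The total element count is the product of the shape entries, not their sum.
total = int(np.prod(a.shape))
print(total, a.size)

# Peeling off one axis at a time mirrors the nesting of the brackets:
# len() of each level is the number of elements in that dimension.
nesting = (len(a), len(a[0]), len(a[0][0]), len(a[0][0][0]))
print(nesting, a.shape)
```

Each `len()` call counts the sub-arrays one bracket level deeper, which is why the four numbers match `a.shape`.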

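Having covered shape, let's look at dtype. It refers to the type of the array elements; note that the elements here are the values 1, 2, ..., 23, 24. All elements of an array have the same type. In the output discussed here, each element is of type int32. NumPy's default integer type depends on the platform and the NumPy version, so np.arange without an explicit dtype may produce int64 instead, in which case every stride below doubles.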
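Since the element size drives everything that follows, it can help to print it. This is a small sketch, assuming nothing beyond NumPy itself; the names `default` and `explicit` are illustrative.

```python
import numpy as np

# Without a dtype argument the integer type is platform dependent.
default = np.arange(1, 25)
# Passing dtype fixes the element type, and therefore the element size.
explicit = np.arange(1, 25, dtype=np.int32)

print(default.dtype, default.itemsize)    # platform dependent
print(explicit.dtype, explicit.itemsize)  # int32 4
```

`itemsize` is the number of bytes per element, which is the unit in which the strides are built up.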

Finally, let's analyze the strides. For each dimension, the stride is the number of bytes to cross to get from the current element to the next element along that dimension.

In the example above, strides=(48,24,12,4). So where do these four numbers come from?

In the four-dimensional array above the dtype is int32, and an int32 occupies 4 bytes, so one step along the fourth dimension crosses 4 bytes. The fourth dimension has 3 elements, 12 bytes in total, so one step along the third dimension crosses 12 bytes. The third dimension has 2 elements (one-dimensional arrays) of 12 bytes each, so one step along the second dimension crosses 24 bytes. By the same reasoning, one step along the first dimension crosses 48 bytes.

So the strides of the four-dimensional array in this example are (48,24,12,4). Its representation in memory is shown in the following figure:

[Figure: memory layout of the array along axis 0 to axis 3]

The figure shows that to move one step along the third dimension (axis 2) you have to cross the whole fourth dimension (axis 3), that is 3 elements, or 12 bytes. To move one step along the first dimension (axis 0) you have to cross the second, third and fourth dimensions, 12 elements in total, or 12*4=48 bytes.
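The derivation can be written as a short function. This is a sketch under the assumption that the array is stored contiguously in row-major (C) order, which is what np.arange followed by reshape produces; `c_order_strides` is a hypothetical helper, not a NumPy function.

```python
import numpy as np

def c_order_strides(shape, itemsize):
    """Byte strides of a C-contiguous array with this shape and element size.

    Hypothetical helper for illustration; NumPy computes this internally.
    """
    strides = []
    step = itemsize  # the last axis moves one element at a time
    for n in reversed(shape):
        strides.append(step)
        step *= n  # one step on the next axis out skips n of these steps
    return tuple(reversed(strides))

a = np.arange(1, 25, dtype=np.int32).reshape((2, 2, 2, 3))
computed = c_order_strides(a.shape, a.itemsize)
print(computed)   # (48, 24, 12, 4)
print(a.strides)  # (48, 24, 12, 4)
```

The loop walks the shape from the innermost axis outward, multiplying as it goes, which is the same reasoning as 4, 3*4, 2*12, 2*24.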

This is how NumPy stores data: in one uniform, contiguous block of memory. In other words, NumPy stores a multidimensional array in the form of a one-dimensional array. Knowing only the number of bytes per element (dtype) and the number of elements in each dimension (shape), we can quickly locate any element in any dimension.
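Locating an element this way amounts to a dot product of the index with the strides. The sketch below assumes a contiguous int32 array; the index (0, 1, 1, 2) is an arbitrary choice for illustration.

```python
import numpy as np

a = np.arange(1, 25, dtype=np.int32).reshape((2, 2, 2, 3))
index = (0, 1, 1, 2)

# Byte offset from the start of the data: sum of index[i] * strides[i].
offset = sum(i * s for i, s in zip(index, a.strides))

# View the same data as a one-dimensional array and step to that element.
flat = a.ravel()
value = flat[offset // a.itemsize]

print(offset)           # 44
print(value, a[index])  # 12 12
```

Here 0*48 + 1*24 + 1*12 + 2*4 = 44 bytes, which is the 12th element of the flat data (position 11), and that element is the same one that `a[0, 1, 1, 2]` returns.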

2 Indexing and transposing high-dimensional NumPy arrays

2.1 Indexing

When it comes to indexing, you may think it is easy: don't you just get an element through its index? That is true. But with high-dimensional arrays, working out the index of an element can be troublesome.

Let's work through the indexing of a four-dimensional array with an example. Suppose I want to get the element 17 from the array above. What should I do? First, picture the four-dimensional array along its axes, as in the figure above. We can think of it as four blocks: axis 0 and axis 1 determine the position of a block, and axis 2 and axis 3 determine the position of an element within the block.

In the figure, 17 is in the third block, highlighted in yellow. In terms of axis 0 and axis 1, the index of this block is [1,0]. With the block determined, next we locate the element within the block:

[Figure: the block containing 17, with 17 highlighted]

Within the block, the index of 17 is [1,1]. Finally, we merge the index of the block, [1,0], and the index of the element within the block, [1,1], in the format [axis 0, axis 1, axis 2, axis 3]. The merged index of 17 is therefore [1,0,1,1].

Of course, you can use the following code to confirm that our analysis is correct.

import numpy as np

a = np.arange(1,25).reshape((2,2,2,3))

print(a[1,0,1,1])
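The same index can also be recovered mechanically instead of by reading the figure. The following sketch uses np.flatnonzero and np.unravel_index; the names `flat_pos`, `index` and `block` are illustrative.

```python
import numpy as np

a = np.arange(1, 25).reshape((2, 2, 2, 3))

# Position of 17 in the flattened (row-major) data.
flat_pos = int(np.flatnonzero(a == 17)[0])

# Convert the flat position back to one index per axis.
index = tuple(int(i) for i in np.unravel_index(flat_pos, a.shape))
print(flat_pos, index)  # 16 (1, 0, 1, 1)

# Axis 0 and axis 1 pick the block; axis 2 and axis 3 pick the element in it.
block = a[1, 0]
print(block)
print(block[1, 1])  # 17
```

Splitting the lookup into `a[1, 0]` followed by `[1, 1]` follows the block-then-element reasoning used above.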

That is how indexing works for high-dimensional arrays. Is it clear now?

2.2 Transposing high-dimensional arrays

Transposing high-dimensional arrays has always been one of the difficulties in learning NumPy. Although you only need to call numpy.transpose to perform the transpose, can you really explain why the result looks the way it does, especially for high-dimensional arrays? Let's analyze a simple example.

[Figure: the original array]

The figure shows the original array. Transposing it with the following code gives the result below:

import numpy as np
a = np.arange(16).reshape((2,2,4))

b = a.transpose((1,0,2))
print(b)

The result of the transpose:

[Figure: the transposed array]

You have probably spotted the difference: the index order of the axes has been exchanged. Because the code swaps axis 0 and axis 1, b[1,0] in the transposed result is a[0,1] of the original array, and b[0,1] is a[1,0] of the original array.

[Figure: the exchange of a[0,1] and a[1,0]]

Now that the principle is clear, here is a question to think about: how would you transpose the array on the left to get the array on the right?

[Figure: the two arrays for the exercise]
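The axis exchange described for transpose((1,0,2)) can be verified in code. This is a minimal sketch that also shows how the transpose relates to the strides from section 1: no data are moved, only the stride order changes.

```python
import numpy as np

a = np.arange(16).reshape((2, 2, 4))
b = a.transpose((1, 0, 2))

# Swapping axis 0 and axis 1 means b[i, j, k] is a[j, i, k].
print(np.array_equal(b[1, 0], a[0, 1]))  # True
print(np.array_equal(b[0, 1], a[1, 0]))  # True

# b is a view on the same memory; its strides are a's strides, permuted.
print(a.strides, b.strides)
print(np.shares_memory(a, b))            # True
```

Because only the strides are permuted, the same approach applies to any axis order passed to transpose.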

Summary

In this issue we introduced the ndarray memory mechanism and the indexing and transposing of high-dimensional arrays. There is still a lot more to know about NumPy; these are just some of its difficult points. If you want to learn NumPy more systematically and see the analysis and answer for the question above, please visit our knowledge planet!

Next up: advanced applications of the Python library SciPy
