Training a vision-related neural network always involves reading and writing images. There are many ways to do this, such as matplotlib, cv2, and PIL. This post times several of them in order to pick the fastest and speed up training.
Because the training framework is PyTorch, the experimental criteria for reading are as follows:
1. Read five 1920x1080 images (one in PNG format, four in JPG format) and store them in an array.
2. Convert the array into a PyTorch tensor whose dimensions are in C x H x W order, stored in GPU memory (I train on a GPU), with the three channels in RGB order.
3. Record the time each method spends on the operations above. Because a PNG image is almost 10 times the size of a JPG image of the same quality, datasets are not usually stored as PNG, so the difference in read time between the two formats is not compared.
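The conversion in step 2 can be illustrated without a GPU. The following is a minimal sketch in numpy, assuming a hypothetical helper name `bgr_hwc_to_rgb_chw` and a batch laid out the way cv2 returns images (height, width, channel, in BGR order). The PyTorch code later in this post does the same thing with `[..., [2, 1, 0]]` and `permute`.

```python
import numpy as np

def bgr_hwc_to_rgb_chw(batch_bgr):
    """Convert an (N, H, W, 3) BGR uint8 batch to (N, 3, H, W) RGB floats in [0, 1]."""
    rgb = batch_bgr[..., [2, 1, 0]]           # reverse the channel axis: BGR -> RGB
    chw = np.transpose(rgb, (0, 3, 1, 2))     # N,H,W,C -> N,C,H,W
    return chw.astype(np.float32) / 255.0     # scale 0..255 down to 0..1

# Tiny stand-in batch (hypothetical data): every pixel is pure blue in BGR order.
batch = np.zeros((2, 4, 6, 3), dtype=np.uint8)
batch[..., 0] = 255
out = bgr_hwc_to_rgb_chw(batch)
print(out.shape, out.dtype)
```

After the conversion the blue values sit in channel index 2, which is where an RGB consumer expects them.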
The experimental criteria for writing are as follows:
1. Convert the PyTorch tensor holding the five 1920x1080 images into an array of the data type that the corresponding method can use.
2. Store the five images in JPG format.
3. Record the time each method takes to store the images.
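Step 1 of the writing criteria is the reverse conversion. Here is a minimal numpy sketch, assuming a hypothetical helper name `rgb_chw_to_bgr_hwc_uint8` and a writer such as cv2 that expects height-width-channel arrays in BGR order. The explicit clip, round, and cast to `uint8` are my own additions for safety; the scripts in this post pass the scaled float array straight to the writer.

```python
import numpy as np

def rgb_chw_to_bgr_hwc_uint8(batch_rgb):
    """Convert (N, 3, H, W) RGB floats in [0, 1] to (N, H, W, 3) BGR uint8."""
    hwc = np.transpose(batch_rgb, (0, 2, 3, 1))      # N,C,H,W -> N,H,W,C
    bgr = hwc[..., ::-1]                             # RGB -> BGR
    scaled = np.clip(bgr * 255.0, 0, 255).round()    # back to the 0..255 range
    return np.ascontiguousarray(scaled.astype(np.uint8))

# Tiny stand-in batch (hypothetical data): every pixel is pure red in RGB order.
batch = np.zeros((2, 3, 4, 6), dtype=np.float32)
batch[:, 0] = 1.0
out = rgb_chw_to_bgr_hwc_uint8(batch)
print(out.shape, out.dtype)
```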
Because a GPU is involved, there are two ways to read images with cv2:
1. First read all the images into one numpy array, then convert it into a PyTorch tensor stored on the GPU.
2. Initialize a PyTorch tensor stored on the GPU, then copy each image directly into that tensor.
The code for the first way is as follows:
    import os, torch
    import cv2 as cv
    import numpy as np
    from time import time

    read_path = 'D:test'
    write_path = 'D:test\\write\\'

    # cv2 read, way 1: collect everything in a numpy array, then move it to the GPU
    start_t = time()
    imgs = np.zeros([5, 1080, 1920, 3])
    for img, i in zip(os.listdir(read_path), range(5)):
        img = cv.imread(filename=os.path.join(read_path, img))
        imgs[i] = img
    # BGR -> RGB, NHWC -> NCHW, scale to [0, 1]
    imgs = torch.tensor(imgs).to('cuda')[...,[2,1,0]].permute([0,3,1,2])/255
    print('cv2 Read time 1:', time() - start_t)

    # cv2 store: NCHW -> NHWC, RGB -> BGR, scale back to [0, 255]
    start_t = time()
    imgs = (imgs.permute([0,2,3,1])[...,[2,1,0]]*255).cpu().numpy()
    for i in range(imgs.shape[0]):
        cv.imwrite(write_path + str(i) + '.jpg', imgs[i])
    print('cv2 Storage time :', time() - start_t)
The results of the experiment:
    cv2 Read time 1: 0.39693760871887207
    cv2 Storage time : 0.3560612201690674
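The measurements in this post wrap each step in a pair of `time()` calls and run it once. A minimal sketch of a reusable timing helper follows, assuming hypothetical names `timed`, `sync`, and `repeats`. CUDA operations are launched asynchronously, so passing `torch.cuda.synchronize` as `sync` would be one way to make sure pending GPU work is counted; the numbers reported here were not collected this way.

```python
from time import perf_counter

def timed(fn, *args, sync=None, repeats=3, **kwargs):
    """Run fn(*args, **kwargs) `repeats` times; return (best_seconds, last_result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        if sync is not None:
            sync()                      # finish earlier asynchronous work first
        start = perf_counter()
        result = fn(*args, **kwargs)
        if sync is not None:
            sync()                      # wait for work launched by fn itself
        best = min(best, perf_counter() - start)
    return best, result

# Stand-in workload (hypothetical): summing a range instead of loading images.
seconds, total = timed(sum, range(1_000_000))
print(f"best of 3 runs: {seconds:.4f}s")
```

Taking the best of a few runs reduces the influence of one-off costs such as disk caching or CUDA initialization on the comparison.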
The code for the second way is as follows:
    import os, torch
    import cv2 as cv
    import numpy as np
    from time import time

    read_path = 'D:test'
    write_path = 'D:test\\write\\'

    # cv2 read, way 2: allocate the tensor on the GPU, then copy each image into it
    start_t = time()
    imgs = torch.zeros([5, 1080, 1920, 3], device='cuda')
    for img, i in zip(os.listdir(read_path), range(5)):
        img = torch.tensor(cv.imread(filename=os.path.join(read_path, img)), device='cuda')
        imgs[i] = img
    # BGR -> RGB, NHWC -> NCHW, scale to [0, 1]
    imgs = imgs[...,[2,1,0]].permute([0,3,1,2])/255
    print('cv2 Read time 2:', time() - start_t)

    # cv2 store: NCHW -> NHWC, RGB -> BGR, scale back to [0, 255]
    start_t = time()
    imgs = (imgs.permute([0,2,3,1])[...,[2,1,0]]*255).cpu().numpy()
    for i in range(imgs.shape[0]):
        cv.imwrite(write_path + str(i) + '.jpg', imgs[i])
    print('cv2 Storage time :', time() - start_t)
The results of the experiment:
    cv2 Read time 2: 0.23636841773986816
    cv2 Storage time : 0.3066873550415039
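One factor that may contribute to the gap between the two read times is the size of the staging buffer: `np.zeros` defaults to float64, `torch.zeros` defaults to float32, and `cv.imread` returns uint8 data. This was not measured separately, so treat it as a guess. The arithmetic below is a small sketch (the helper name `buffer_mib` is hypothetical) showing how much memory a `[5, 1080, 1920, 3]` buffer occupies for each data type.

```python
from math import prod

# Bytes per element for the data types that appear in the scripts above.
BYTES_PER_ELEMENT = {"uint8": 1, "float32": 4, "float64": 8}

def buffer_mib(shape, dtype):
    """Size in MiB of a dense array with the given shape and dtype."""
    return prod(shape) * BYTES_PER_ELEMENT[dtype] / 2**20

shape = (5, 1080, 1920, 3)   # five 1920x1080 three-channel images
for dtype in BYTES_PER_ELEMENT:
    print(f"{dtype:>8}: {buffer_mib(shape, dtype):7.1f} MiB")
```

The float64 buffer is twice the size of the float32 one and eight times the size of the raw uint8 pixels, and all of it has to cross to the GPU in the first way.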
matplotlib can be used in the same two ways. The code for the first way is as follows:
    import os, torch
    import numpy as np
    import matplotlib.pyplot as plt
    from time import time

    read_path = 'D:test'
    write_path = 'D:test\\write\\'

    # matplotlib read, way 1: collect everything in a numpy array, then move it to the GPU
    start_t = time()
    imgs = np.zeros([5, 1080, 1920, 3])
    for img, i in zip(os.listdir(read_path), range(5)):
        img = plt.imread(os.path.join(read_path, img))   # already RGB
        imgs[i] = img
    # NHWC -> NCHW, scale to [0, 1]
    imgs = torch.tensor(imgs).to('cuda').permute([0,3,1,2])/255
    print('matplotlib Read time 1:', time() - start_t)

    # matplotlib store: NCHW -> NHWC, values stay in [0, 1]
    start_t = time()
    imgs = (imgs.permute([0,2,3,1])).cpu().numpy()
    for i in range(imgs.shape[0]):
        plt.imsave(write_path + str(i) + '.jpg', imgs[i])
    print('matplotlib Storage time :', time() - start_t)
The results of the experiment:
    matplotlib Read time 1: 0.45380306243896484
    matplotlib Storage time : 0.768944263458252
The code for the second way is as follows:
    import os, torch
    import numpy as np
    import matplotlib.pyplot as plt
    from time import time

    read_path = 'D:test'
    write_path = 'D:test\\write\\'

    # matplotlib read, way 2: allocate the tensor on the GPU, then copy each image into it
    start_t = time()
    imgs = torch.zeros([5, 1080, 1920, 3], device='cuda')
    for img, i in zip(os.listdir(read_path), range(5)):
        img = torch.tensor(plt.imread(os.path.join(read_path, img)), device='cuda')
        imgs[i] = img
    # NHWC -> NCHW, scale to [0, 1]
    imgs = imgs.permute([0,3,1,2])/255
    print('matplotlib Read time 2:', time() - start_t)

    # matplotlib store: NCHW -> NHWC, values stay in [0, 1]
    start_t = time()
    imgs = (imgs.permute([0,2,3,1])).cpu().numpy()
    for i in range(imgs.shape[0]):
        plt.imsave(write_path + str(i) + '.jpg', imgs[i])
    print('matplotlib Storage time :', time() - start_t)
The results of the experiment:
    matplotlib Read time 2: 0.2044532299041748
    matplotlib Storage time : 0.4737534523010254
Note that when matplotlib reads a PNG image, the array values are floating-point numbers in the range $[0, 1]$, whereas a JPG image gives integers in the range $[0, 255]$. So if the images in a dataset are in mixed formats, convert them to a consistent format before reading; otherwise preprocessing the dataset becomes troublesome.
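If converting the files beforehand is not practical, the two ranges can also be reconciled right after loading. This is a minimal sketch, assuming a hypothetical helper name `to_unit_float` and assuming that any integer-typed input is 8-bit data in $[0, 255]$; it is not part of the timed experiments above.

```python
import numpy as np

def to_unit_float(img):
    """Return img as float32 in [0, 1], whichever range the loader produced."""
    if np.issubdtype(img.dtype, np.integer):
        # Integer data, e.g. a JPG read as uint8 in [0, 255] (assumed 8-bit).
        return img.astype(np.float32) / 255.0
    # Floating-point data, e.g. a PNG that is already in [0, 1].
    return img.astype(np.float32)

# Stand-in arrays (hypothetical data) mimicking the two cases.
jpg_like = np.array([0, 255], dtype=np.uint8)
png_like = np.array([0.0, 0.5, 1.0], dtype=np.float32)
print(to_unit_float(jpg_like), to_unit_float(png_like))
```

With a helper like this applied per image, the later division by 255 on the whole batch would no longer be needed.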
PIL cannot work on PyTorch tensors or numpy arrays directly; they have to be converted to its Image type first. That extra step is troublesome and I expected it to be slower than the other methods, so it is not tested here.
torchvision provides a function that stores images directly from PyTorch tensors. Combining it with the fastest read method above (matplotlib, second way), the code is as follows:
    import os, torch
    import matplotlib.pyplot as plt
    from time import time
    from torchvision import utils

    read_path = 'D:test'
    write_path = 'D:test\\write\\'

    # matplotlib read, way 2: allocate the tensor on the GPU, then copy each image into it
    start_t = time()
    imgs = torch.zeros([5, 1080, 1920, 3], device='cuda')
    for img, i in zip(os.listdir(read_path), range(5)):
        img = torch.tensor(plt.imread(os.path.join(read_path, img)), device='cuda')
        imgs[i] = img
    # NHWC -> NCHW, scale to [0, 1]
    imgs = imgs.permute([0,3,1,2])/255
    print('matplotlib Read time 2:', time() - start_t)

    # torchvision store: takes the CxHxW tensor directly
    start_t = time()
    for i in range(imgs.shape[0]):
        utils.save_image(imgs[i], write_path + str(i) + '.jpg')
    print('torchvision Storage time :', time() - start_t)
The results of the experiment:
    matplotlib Read time 2: 0.15358829498291016
    torchvision Storage time : 0.14760661125183105
You can see that these two are the fastest ways to read and write. In addition, to keep image reading and writing from slowing down the training program, these two steps can also be run in parallel with training. Finally, utils.save_image can tile multiple images into a single stored image. It is used as follows:
    utils.save_image(tensor=imgs,           # batch of image tensors to store, shape [n, C, H, W]
                     fp='test.jpg',         # output path
                     nrow=5,                # number of images per row in the tiled grid
                     padding=1,             # spacing between images in the tiled grid
                     normalize=True,        # rescale the values; typically needed when the images come from a tanh output
                     value_range=(-1, 1))   # value range used for normalization (named `range` in older torchvision releases)
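As far as I understand it, normalizing with a value range of (-1, 1) clamps each value to that range and then maps it linearly onto [0, 1]. The following is a sketch of that arithmetic on plain Python floats, with a hypothetical helper name `normalize_to_unit`; it illustrates the mapping and is not torchvision's own code.

```python
def normalize_to_unit(value, low=-1.0, high=1.0):
    """Clamp value to [low, high], then map it linearly onto [0, 1]."""
    clamped = min(max(value, low), high)
    # Guard against a zero-width range to avoid dividing by zero.
    return (clamped - low) / max(high - low, 1e-5)

# A tanh output of -1 becomes black, 0 becomes mid-gray, and +1 becomes white.
for v in (-1.0, 0.0, 1.0, 2.0):
    print(v, "->", normalize_to_unit(v))
```

Values outside the range, such as 2.0 here, are clamped to the nearest end before scaling.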
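Running image writing in parallel with training, as suggested above, can be sketched with a thread pool from the standard library. The names `save_in_background` and `fake_save` are hypothetical; `fake_save` stands in for a real writer such as `utils.save_image(image, path)`. In a real training loop I would assume the tensors are detached and copied off the GPU before being queued, so later training steps cannot modify them.

```python
from concurrent.futures import ThreadPoolExecutor

def save_in_background(executor, save_fn, items):
    """Queue save_fn(image, path) for each (image, path) pair; return the futures."""
    return [executor.submit(save_fn, image, path) for image, path in items]

written = []

def fake_save(image, path):
    # Stand-in for a real image writer; it only records the path it was given.
    written.append(path)
    return path

with ThreadPoolExecutor(max_workers=2) as pool:
    futures = save_in_background(pool, fake_save, [("img0", "0.jpg"), ("img1", "1.jpg")])
    # Training steps would continue here while the saves run in the worker threads.
    results = [f.result() for f in futures]   # wait for all saves to finish

print(results)
```

Collecting the futures and calling `result()` on them also surfaces any exception raised by a failed save instead of losing it silently.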