Training deep learning models is often time-consuming: a few hours is common, a few days is not unusual, and sometimes training takes dozens of days.
The time cost comes mainly from two parts: data preparation and parameter iteration.
When data preparation is the main bottleneck of training time, we can use more processes to prepare the data.
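In PyTorch, multi-process data preparation is usually done through the `num_workers` argument of `DataLoader`. Below is a minimal sketch; the toy tensors and the worker count are illustrative, not from the original text.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# A toy dataset; in practice this would be a Dataset whose __getitem__
# does the expensive preprocessing (decoding, augmentation, etc.).
features = torch.randn(1000, 8)
labels = torch.randint(0, 2, (1000,))
ds = TensorDataset(features, labels)

# num_workers > 0 spawns that many worker processes to prepare batches
# in parallel; pin_memory=True speeds up later host-to-GPU copies.
dl = DataLoader(ds, batch_size=32, shuffle=True,
                num_workers=2, pin_memory=True)

for x, y in dl:
    pass  # the training step would go here
```

A reasonable rule of thumb is to increase `num_workers` until the GPU stops waiting on data, keeping it below the number of CPU cores.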
When parameter iteration becomes the main bottleneck, the usual approach is to use a GPU for acceleration.
Using a GPU to accelerate a model in PyTorch is very simple: just move the model and the data onto the GPU. The core code is only the following few lines.
```python
# define the model
...
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)  # move the model to cuda

# train the model
...
features = features.to(device)  # move the data to cuda
labels = labels.to(device)
# or: labels = labels.cuda() if torch.cuda.is_available() else labels
...
```
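To make the pattern concrete, here is a self-contained toy training loop built around those lines; the linear model, loss, and data are placeholders I chose for illustration, and the code falls back to CPU when no GPU is present.

```python
import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = nn.Linear(8, 2)  # placeholder model
model.to(device)         # move the model's parameters to the device

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

features = torch.randn(32, 8)
labels = torch.randint(0, 2, (32,))

for step in range(5):
    x = features.to(device)  # data must live on the same device as the model
    y = labels.to(device)
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```

The key invariant is that the model's parameters and every tensor it touches are on the same device; mixing CPU and GPU tensors raises a runtime error.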
If you want to train a model with more than one GPU, that is also very simple: just wrap the model as a data-parallel model.
After the model is moved to the GPU, it is replicated on each GPU, and each batch of data is split evenly across the GPUs for training. The core code is as follows.
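As a minimal sketch of the standard `torch.nn.DataParallel` usage (the placeholder model is my own; the wrapper is only applied when more than one GPU is visible, and the code still runs on CPU otherwise):

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 2)  # placeholder model

# DataParallel replicates the module on each visible GPU and splits
# every input batch across the replicas along dimension 0.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)  # move the (possibly wrapped) model to the primary device
```

Calling `model(batch)` then behaves exactly as before; the splitting and gathering of outputs happen inside the wrapper.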