The general process of implementing a neural network model with PyTorch includes:
1. Prepare the data
2. Define the model
3. Train the model
4. Evaluate the model
5. Use the model
6. Save the model
For beginners, the most difficult part is actually the data preparation process.
The data types we usually encounter in practice include structured data, image data, text data, and time-series data.
Taking the Titanic survival prediction problem, the cifar2 image classification problem, the imdb movie review classification problem, and the prediction of the end date of the COVID-19 epidemic in China as examples, we will demonstrate how to apply PyTorch modeling methods to these four types of data.
PyTorch is a Python-based machine learning library, widely used in deep learning areas such as computer vision and natural language processing. It is the deep learning framework that currently competes head-to-head with TensorFlow, and it is very popular in academic circles.
It mainly provides the following two core capabilities, sketched in the code after this list:
1. GPU-accelerated tensor computation.
2. An automatic differentiation mechanism that makes it convenient to optimize models.
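A minimal sketch of these two capabilities (the values are arbitrary; the GPU branch runs only if a CUDA device is present):

```python
import torch

# capability 1: tensor computation, optionally GPU-accelerated
x = torch.randn(3, 3)
if torch.cuda.is_available():
    x = x.to("cuda")          # move the tensor to the GPU if one is present
y = x @ x.T                   # runs on whichever device x lives on

# capability 2: automatic differentiation
w = torch.tensor(2.0, requires_grad=True)
loss = (3.0 * w - 1.0) ** 2
loss.backward()               # fills w.grad with d(loss)/dw
print(w.grad)                 # tensor(30.)
```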
The main advantages of PyTorch:
Simple and easy to understand: PyTorch's API is designed to be concise and consistent, with basically three levels of encapsulation: tensor, autograd, and nn. It is very easy to learn. There is a joke that TensorFlow's design philosophy is "Make it complicated", Keras's design philosophy is "Make it complicated and hide it", and PyTorch's design philosophy is "Keep it simple and stupid".
Easy to debug: PyTorch uses dynamic graphs, so it can be debugged just like ordinary Python code. Unlike TensorFlow, PyTorch's error messages are usually easy to understand. There is a joke that you can never find the real cause of a TensorFlow error from its error message.
Powerful and efficient: PyTorch provides a very rich set of model components, so you can implement ideas quickly, and it runs very fast. At present, most deep learning papers are implemented with PyTorch. Some researchers say that after switching from TensorFlow to PyTorch, they sleep better, their hair is thicker than before, and their skin is smoother than before.
As the saying goes, a towering building rises from level ground; the PyTorch building also has its foundation.
The core concepts at the bottom of PyTorch are tensors, dynamic computation graphs, and automatic differentiation.
In this chapter we introduce PyTorch at five different levels: the hardware layer, the kernel layer, the low-level API, the mid-level API, and the high-level API (torchkeras). Taking a linear regression model and a DNN binary classification model as examples, we intuitively compare the characteristics of implementing models at different levels.
From low to high, PyTorch's architecture can be divided into five levels.
The bottom layer is the hardware layer: PyTorch supports adding CPUs and GPUs to the pool of computing resources.
The second layer is the kernel, implemented in C++.
The third layer consists of operators implemented in Python, which wrap the low-level API calls of the C++ kernel. It mainly includes various tensor operators, automatic differentiation, and variable management.
Examples: torch.tensor, torch.cat, torch.autograd.grad, nn.Module.
If you compare a model to a house, the third-layer API is the bricks of the model.
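A quick sketch of these bricks in action (values chosen for illustration):

```python
import torch

# tensor creation and concatenation with low-level operators
a = torch.tensor([[1.0, 2.0]])
b = torch.tensor([[3.0, 4.0]])
c = torch.cat([a, b], dim=0)              # shape (2, 2)

# functional-style differentiation with torch.autograd.grad
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2 + 2 * x
(dy_dx,) = torch.autograd.grad(y, x)      # dy/dx = 2x + 2 = 8 at x = 3
print(c, dy_dx)
```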
The fourth layer consists of model components implemented in Python, which encapsulate the lower-level API functions. It mainly includes various model layers, loss functions, optimizers, data pipelines, and so on.
Examples: torch.nn.Linear, torch.nn.BCELoss, torch.optim.Adam, torch.utils.data.DataLoader.
If you compare a model to a house, the fourth-layer API is the walls of the model.
The fifth layer consists of model interfaces implemented in Python. PyTorch has no official high-level API, so to train models conveniently the author imitated the Keras model interface and, in fewer than 300 lines of code, encapsulated a high-level PyTorch model interface, torchkeras.Model. If you compare a model to a house, the fifth-layer API is the model itself, i.e. the house.
PyTorch's low-level API mainly includes tensor operations, dynamic computation graphs, and automatic differentiation.
If you compare a model to a house, the low-level API is the bricks of the model.
At the low-level API level, you can use PyTorch as an enhanced version of numpy.
The methods PyTorch provides are more comprehensive than numpy's and computation is faster; if necessary, you can also use a GPU for acceleration.
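For example, a minimal sketch of numpy-style usage (values chosen for illustration):

```python
import torch

# numpy-style array manipulation with torch tensors
a = torch.arange(6, dtype=torch.float32).reshape(2, 3)
b = torch.ones(2, 3)
print(a + b)                  # elementwise addition
print(a.mean(dim=1))          # row means, like ndarray.mean(axis=1)

# zero-copy interoperability with numpy on CPU
arr = a.numpy()               # tensor -> ndarray (shares memory)
t = torch.from_numpy(arr)     # ndarray -> tensor (shares memory)
```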
In the previous chapters we already gained an overall understanding of the low-level API; in this chapter we focus on tensor operations and dynamic computation graphs.
Tensor operations mainly include structural operations on tensors and mathematical operations on tensors.
Structural operations include: tensor creation, indexing and slicing, dimension transformation, and merging and splitting.
Mathematical operations mainly include: scalar operations, vector operations, and matrix operations. In addition, we will introduce the broadcasting mechanism of tensor operations.
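A minimal sketch covering both kinds of operations plus broadcasting (values chosen for illustration):

```python
import torch

# structural operations: creation, indexing/slicing, reshaping, merging/splitting
t = torch.arange(12).reshape(3, 4)
row = t[1]                           # indexing
sub = t[:, 1:3]                      # slicing
flat = t.reshape(-1)                 # dimension transformation
both = torch.cat([t, t], dim=0)      # merging
parts = torch.split(both, 3, dim=0)  # splitting back into two (3, 4) chunks

# mathematical operations with broadcasting
a = torch.tensor([[1.0], [2.0], [3.0]])   # shape (3, 1)
b = torch.tensor([10.0, 20.0])            # shape (2,)
print(a + b)                              # broadcast to shape (3, 2)
```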
For dynamic computation graphs, we will mainly introduce their characteristics, the Function nodes in the graph, and the relationship between computation graphs and backpropagation.
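A minimal sketch of how operations record Function nodes that backpropagation later traverses (values chosen for illustration):

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x * 3
z = y + 1

# each operation recorded a Function node in the dynamic graph
print(z.grad_fn)                  # e.g. <AddBackward0 object at ...>
print(z.grad_fn.next_functions)   # links back to the multiplication node

z.backward()                      # backpropagate through the recorded graph
print(x.grad)                     # dz/dx = 3
```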
We will mainly introduce PyTorch's mid-level API below, in particular the model layers.
If you compare a model to a house, the mid-level API is the walls of the model.
PyTorch has no official high-level API; usually you build a model via nn.Module and write a custom training loop.
To make training models more convenient, the author wrote torchkeras, a PyTorch model interface that imitates Keras, to serve as a high-level PyTorch API.
In this chapter, we mainly introduce in detail the following content related to PyTorch's high-level API:
Three ways to build models (inheriting the nn.Module base class, using nn.Sequential, using model containers to assist encapsulation).
Three ways to train models (script style, function style, torchkeras.Model class style).
Using GPUs to train models (single-GPU training, multi-GPU training); a minimal single-GPU sketch follows this list.
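As a preview of single-GPU training, here is a minimal sketch; the model, layer sizes, and random data are placeholders, and the code falls back to the CPU when no CUDA device is available:

```python
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(15, 1).to(device)        # move the parameters to the device
features = torch.randn(8, 15).to(device)   # move each batch of data as well
labels = torch.randn(8, 1).to(device)

loss = nn.functional.mse_loss(model(features), labels)
loss.backward()                            # gradients are computed on the device
```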
```python
import os
import datetime

# print a timestamped separator
def printbar():
    nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    print("\n" + "=========="*8 + "%s"%nowtime)

# on macOS, pytorch and matplotlib need this environment variable
# changed when running in jupyter
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
```
The goal of the Titanic dataset is to predict, based on passenger information, whether a passenger survives after the Titanic hits an iceberg.
Structured data is generally preprocessed with a Pandas DataFrame.
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader, TensorDataset

dftrain_raw = pd.read_csv('/home/kesci/input/data6936/data/titanic/train.csv')
dftest_raw = pd.read_csv('/home/kesci/input/data6936/data/titanic/test.csv')
dftrain_raw.head(10)
```
Field descriptions: Survived (0 = died, 1 = survived, the prediction target), Pclass (ticket class), Sex, Age, SibSp (number of siblings/spouses aboard), Parch (number of parents/children aboard), Fare (ticket fare), Cabin (cabin number), Embarked (port of embarkation).
Using Pandas, we can easily carry out exploratory data analysis (EDA).
Label distribution:
```python
%matplotlib inline
%config InlineBackend.figure_format = 'png'
ax = dftrain_raw['Survived'].value_counts().plot(kind='bar',
     figsize=(12, 8), fontsize=15, rot=0)
ax.set_ylabel('Counts', fontsize=15)
ax.set_xlabel('Survived', fontsize=15)
plt.show()
```
Age distribution:

```python
%matplotlib inline
%config InlineBackend.figure_format = 'png'
ax = dftrain_raw['Age'].plot(kind='hist', bins=20, color='purple',
     figsize=(12, 8), fontsize=15)
ax.set_ylabel('Frequency', fontsize=15)
ax.set_xlabel('Age', fontsize=15)
plt.show()
```
The formal data preprocessing follows.
```python
def preprocessing(dfdata):
    dfresult = pd.DataFrame()

    # Pclass: one-hot encode passenger class
    dfPclass = pd.get_dummies(dfdata['Pclass'])
    dfPclass.columns = ['Pclass_' + str(x) for x in dfPclass.columns]
    dfresult = pd.concat([dfresult, dfPclass], axis=1)

    # Sex: one-hot encode
    dfSex = pd.get_dummies(dfdata['Sex'])
    dfresult = pd.concat([dfresult, dfSex], axis=1)

    # Age: fill missing values with 0 and add a missing-value indicator
    dfresult['Age'] = dfdata['Age'].fillna(0)
    dfresult['Age_null'] = pd.isna(dfdata['Age']).astype('int32')

    # SibSp, Parch, Fare: use as-is
    dfresult['SibSp'] = dfdata['SibSp']
    dfresult['Parch'] = dfdata['Parch']
    dfresult['Fare'] = dfdata['Fare']

    # Cabin: keep only a missing-value indicator
    dfresult['Cabin_null'] = pd.isna(dfdata['Cabin']).astype('int32')

    # Embarked: one-hot encode, treating NaN as its own category
    dfEmbarked = pd.get_dummies(dfdata['Embarked'], dummy_na=True)
    dfEmbarked.columns = ['Embarked_' + str(x) for x in dfEmbarked.columns]
    dfresult = pd.concat([dfresult, dfEmbarked], axis=1)

    return dfresult

x_train = preprocessing(dftrain_raw).values
y_train = dftrain_raw[['Survived']].values

x_test = preprocessing(dftest_raw).values
y_test = dftest_raw[['Survived']].values

print("x_train.shape =", x_train.shape)
print("x_test.shape =", x_test.shape)
print("y_train.shape =", y_train.shape)
print("y_test.shape =", y_test.shape)
```
Using DataLoader and TensorDataset, the data can be encapsulated into a pipeline.
```python
dl_train = DataLoader(TensorDataset(torch.tensor(x_train).float(), torch.tensor(y_train).float()),
                      shuffle=True, batch_size=8)
dl_valid = DataLoader(TensorDataset(torch.tensor(x_test).float(), torch.tensor(y_test).float()),
                      shuffle=False, batch_size=8)
```
```python
# test the data pipeline
for features, labels in dl_train:
    print(features, labels)
    break
```
There are usually three ways to build models with PyTorch: using nn.Sequential to build the model in layer order, inheriting the nn.Module base class to build a custom model, and inheriting the nn.Module base class while using model containers to assist encapsulation.
Here we choose the simplest approach: an nn.Sequential model built in layer order.
```python
def create_net():
    net = nn.Sequential()
    net.add_module("linear1", nn.Linear(15, 20))
    net.add_module("relu1", nn.ReLU())
    net.add_module("linear2", nn.Linear(20, 15))
    net.add_module("relu2", nn.ReLU())
    net.add_module("linear3", nn.Linear(15, 1))
    net.add_module("sigmoid", nn.Sigmoid())
    return net

net = create_net()
print(net)
```
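For comparison, here is a minimal sketch of the same network built with the first approach, inheriting the nn.Module base class (the class name Net is our own choice and is not used in the rest of this section):

```python
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(15, 20)
        self.linear2 = nn.Linear(20, 15)
        self.linear3 = nn.Linear(15, 1)

    def forward(self, x):
        x = torch.relu(self.linear1(x))
        x = torch.relu(self.linear2(x))
        return torch.sigmoid(self.linear3(x))
```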
```python
!pip install torchkeras
!pip install prettytable
```
```python
from torchkeras import summary
summary(net, input_shape=(15,))
```
PyTorch usually requires the user to write a custom training loop, and the code style of training loops varies from person to person.
There are three typical styles of training loop code: script style, function style, and class style.
Here we use a fairly general script style.
```python
from sklearn.metrics import accuracy_score

loss_func = nn.BCELoss()
optimizer = torch.optim.Adam(params=net.parameters(), lr=0.01)
metric_func = lambda y_pred, y_true: accuracy_score(y_true.data.numpy(), y_pred.data.numpy() > 0.5)
metric_name = "accuracy"
```
```python
epochs = 10
log_step_freq = 30

dfhistory = pd.DataFrame(columns=["epoch", "loss", metric_name, "val_loss", "val_"+metric_name])
print("Start Training...")
nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
print("=========="*8 + "%s"%nowtime)

for epoch in range(1, epochs+1):

    # 1. training loop -------------------------------------------------
    net.train()
    loss_sum = 0.0
    metric_sum = 0.0
    step = 1

    for step, (features, labels) in enumerate(dl_train, 1):

        # zero the gradients
        optimizer.zero_grad()

        # forward pass to compute the loss
        predictions = net(features)
        loss = loss_func(predictions, labels)
        metric = metric_func(predictions, labels)

        # backward pass to compute gradients and update parameters
        loss.backward()
        optimizer.step()

        # print batch-level logs
        loss_sum += loss.item()
        metric_sum += metric.item()
        if step % log_step_freq == 0:
            print(("[step = %d] loss: %.3f, "+metric_name+": %.3f") %
                  (step, loss_sum/step, metric_sum/step))

    # 2. validation loop -------------------------------------------------
    net.eval()
    val_loss_sum = 0.0
    val_metric_sum = 0.0
    val_step = 1

    for val_step, (features, labels) in enumerate(dl_valid, 1):
        # disable gradient computation
        with torch.no_grad():
            predictions = net(features)
            val_loss = loss_func(predictions, labels)
            val_metric = metric_func(predictions, labels)
        val_loss_sum += val_loss.item()
        val_metric_sum += val_metric.item()

    # 3. record logs -------------------------------------------------
    info = (epoch, loss_sum/step, metric_sum/step,
            val_loss_sum/val_step, val_metric_sum/val_step)
    dfhistory.loc[epoch-1] = info

    # print epoch-level logs
    print(("\nEPOCH = %d, loss = %.3f, " + metric_name +
           " = %.3f, val_loss = %.3f, val_" + metric_name + " = %.3f") % info)
    nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    print("\n" + "=========="*8 + "%s"%nowtime)

print('Finished Training...')
```
Let's first evaluate the model's performance on the training and validation sets.
```python
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

import matplotlib.pyplot as plt

def plot_metric(dfhistory, metric):
    train_metrics = dfhistory[metric]
    val_metrics = dfhistory['val_'+metric]
    epochs = range(1, len(train_metrics) + 1)
    plt.plot(epochs, train_metrics, 'bo--')
    plt.plot(epochs, val_metrics, 'ro-')
    plt.title('Training and validation ' + metric)
    plt.xlabel("Epochs")
    plt.ylabel(metric)
    plt.legend(["train_"+metric, 'val_'+metric])
    plt.show()

plot_metric(dfhistory, "loss")
```
5. Using the model
```python
# predicted probabilities
y_pred_probs = net(torch.tensor(x_test[0:10]).float()).data
y_pred_probs
```
```python
# predicted classes
y_pred = torch.where(y_pred_probs > 0.5,
                     torch.ones_like(y_pred_probs),
                     torch.zeros_like(y_pred_probs))
y_pred
```
PyTorch can save models in two ways, both implemented by calling pickle's serialization methods.
The first method saves only the model parameters.
The second saves the entire model.
The first is recommended; the second may cause various problems when switching devices or directories.
1. Save the model parameters (recommended)
```python
# save the parameters
torch.save(net.state_dict(), "./data/net_parameter.pkl")

# rebuild the network and load the saved parameters
net_clone = create_net()
net_clone.load_state_dict(torch.load("./data/net_parameter.pkl"))

net_clone.forward(torch.tensor(x_test[0:10]).float()).data
```
2. Save the complete model (not recommended)
```python
torch.save(net, './data/net_model.pkl')

net_loaded = torch.load('./data/net_model.pkl')
net_loaded(torch.tensor(x_test[0:10]).float()).data
```