Accelerated Python learning - 1 day

The sky is full of stars_ 2020-11-13 00:42:08

One. The PyTorch modeling process

Implementing a neural network model with PyTorch generally involves six steps:

1, Prepare the data

2, Define the model

3, Train the model

4, Evaluate the model

5, Use the model

6, Save the model

For beginners, the hardest part is usually data preparation.

The data types most commonly encountered in practice are structured data, image data, text data, and time series data.

We will use four examples to demonstrate how to model each of these data types with PyTorch: Titanic survival prediction, cifar2 image classification, IMDB movie review classification, and predicting when the COVID-19 epidemic in China will end.

Two. Core concepts of PyTorch

PyTorch is a Python-based machine learning library. It is widely used in deep learning fields such as computer vision and natural language processing. It is currently a deep learning framework on a par with TensorFlow, and it is especially popular in academia.

It provides two core capabilities:

1, GPU-accelerated tensor computation.

2, An automatic differentiation mechanism that makes it convenient to optimize models.
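The two capabilities can be illustrated with a minimal sketch. The values and variable names below are chosen only for illustration, and the code falls back to the CPU when no GPU is visible.

```python
import torch

# Use a GPU when one is visible, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 1, Tensor computation on the chosen device.
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]], device=device)
b = torch.ones(2, 2, device=device)
c = a @ b  # matrix product

# 2, Automatic differentiation: y = x^2 + 3x, so dy/dx = 2x + 3.
x = torch.tensor(2.0, requires_grad=True)
y = x ** 2 + 3 * x
y.backward()  # fills x.grad with dy/dx evaluated at x = 2

print(c.cpu())
print(x.grad)
```

At x = 2 the derivative 2x + 3 equals 7, which is the value stored in x.grad.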

Main advantages of PyTorch:

  • Simple and easy to understand: PyTorch's API design is concise and consistent, built essentially as three layers of abstraction: tensor, autograd, and nn. It is very easy to learn. As one joke puts it, TensorFlow's design philosophy is "Make it complicated", Keras's is "Make it complicated and hide it", and PyTorch's is "Keep it simple and stupid".

  • Easy to debug: PyTorch uses dynamic graphs, so it can be debugged like ordinary Python code. Unlike TensorFlow's, PyTorch's error messages are usually easy to understand. As another joke puts it, you will never find the cause of a bug in a TensorFlow error message.

  • Powerful and efficient: PyTorch provides a rich set of model components for implementing ideas quickly, and it runs fast. At present, most deep learning papers are implemented in PyTorch. Some researchers joke that after switching from TensorFlow to PyTorch they sleep better, their hair is thicker, and their skin is smoother.

As the saying goes, even the tallest building rises from the ground, and PyTorch has its foundations too.

The core concepts underlying PyTorch are tensors, dynamic computation graphs, and automatic differentiation.
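A small sketch may help show what "dynamic" means here: the graph is built as the Python code runs, so ordinary control flow decides which operations are recorded. The function name piecewise is hypothetical and used only for this illustration.

```python
import torch

def piecewise(x):
    # Ordinary Python control flow decides which operations enter the graph.
    if x.item() > 0:
        return x ** 2
    return -3 * x

x_pos = torch.tensor(2.0, requires_grad=True)
x_neg = torch.tensor(-1.0, requires_grad=True)

piecewise(x_pos).backward()  # graph contains x^2, so the gradient is 2x
piecewise(x_neg).backward()  # graph contains -3x, so the gradient is -3

print(x_pos.grad, x_neg.grad)
```

Each call builds a fresh graph, so the two inputs are differentiated through different branches.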

Three. PyTorch's hierarchical structure

In this chapter we introduce the 5 levels of PyTorch: the hardware layer, the kernel layer, the low-level API, the mid-level API, and the high-level API 【torchkeras】. Using linear regression and a DNN binary classification model as examples, we compare how a model is implemented at each level.

From bottom to top, PyTorch's hierarchy can be divided into five layers.

The bottom layer is the hardware layer: PyTorch supports adding CPUs and GPUs to its pool of computing resources.

The second layer is the kernel, implemented in C++.

The third layer consists of operators implemented in Python. It provides low-level API instructions that wrap the C++ kernel, mainly tensor operations, automatic differentiation, and variable management.
Examples: torch.tensor, torch.cat, torch.autograd.grad, nn.Module.
If a model is compared to a house, the third-layer API is 【the bricks of the model】.
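As a rough sketch of what working at this "brick" level looks like, the following fits a linear regression using only tensors and torch.autograd.grad. The synthetic data (y = 2x - 1), the learning rate, and the step count are assumptions made for the illustration.

```python
import torch

# Synthetic data for y = 2x - 1 (values chosen only for illustration).
X = torch.linspace(-1, 1, 50).reshape(-1, 1)
Y = 2.0 * X - 1.0

# Parameters are plain tensors that track gradients.
w = torch.zeros(1, 1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

for _ in range(200):
    loss = ((X @ w + b - Y) ** 2).mean()                # mean squared error
    grad_w, grad_b = torch.autograd.grad(loss, [w, b])  # explicit gradients
    with torch.no_grad():                               # manual gradient descent step
        w -= 0.1 * grad_w
        b -= 0.1 * grad_b

print(w.item(), b.item())
```

Nothing here uses layers, loss classes, or optimizers; everything is assembled by hand from tensor operations.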

The fourth layer consists of model components implemented in Python. It wraps the low-level API in higher-level functionality, mainly model layers, loss functions, optimizers, data pipelines, and so on.
Examples: torch.nn.Linear, torch.nn.BCELoss, torch.optim.Adam, torch.utils.data.DataLoader.
If a model is compared to a house, the fourth-layer API is 【the walls of the model】.
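For comparison, here is a minimal sketch that uses these four components together on a toy binary classification problem. The data, layer sizes, and hyperparameters are made up for the example.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Toy data: the label is 1 when the two features sum to more than 0.
X = torch.randn(64, 2)
Y = (X.sum(dim=1, keepdim=True) > 0).float()
dl = DataLoader(TensorDataset(X, Y), batch_size=16, shuffle=True)

model = nn.Sequential(nn.Linear(2, 1), nn.Sigmoid())  # model layers
loss_func = nn.BCELoss()                                # loss function
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)  # optimizer

losses = []
for epoch in range(20):
    epoch_loss = 0.0
    for features, labels in dl:
        optimizer.zero_grad()
        loss = loss_func(model(features), labels)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    losses.append(epoch_loss / len(dl))  # mean batch loss for this epoch

print(losses[0], losses[-1])
```

The training loop is still written by hand, but the gradient bookkeeping and parameter updates are delegated to the components.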

The fifth layer is the model interface implemented in Python. PyTorch has no official high-level API. To make training easier, the author imitated the Keras model interface and, in fewer than 300 lines of code, wrapped PyTorch in a high-level model interface, torchkeras.Model. If a model is compared to a house, the fifth-layer API is the model itself, that is, 【the model house】.

Four. PyTorch's low-level API

PyTorch's low-level API mainly covers tensor operations, the dynamic computation graph, and automatic differentiation.

If a model is compared to a house, the low-level API is 【the bricks of the model】.

At the low-level API level, PyTorch can be used as an enhanced version of numpy.

PyTorch offers a more comprehensive set of methods than numpy and computes faster, and it can use GPU acceleration when needed.
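A short sketch of the numpy-like usage, assuming a small float32 array as input. The device selection falls back to the CPU when no GPU is available.

```python
import numpy as np
import torch

arr = np.arange(6, dtype=np.float32).reshape(2, 3)

t = torch.from_numpy(arr)   # tensor that shares memory with arr
row_sums = t.sum(dim=1)     # same result as arr.sum(axis=1)
back = (t * 2).numpy()      # convert a CPU tensor back to a numpy array

# Move the data to a GPU when one is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t_dev = t.to(device)

print(row_sums, back, t_dev.device)
```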

In the previous chapters we gained an overall understanding of the low-level API. In this chapter we focus on tensor operations and the dynamic computation graph.

Tensor operations fall into two groups: structural operations and mathematical operations.

Structural operations include tensor creation, index slicing, dimension transformation, and merging and splitting.

Mathematical operations include scalar operations, vector operations, and matrix operations. We will also introduce the broadcasting mechanism of tensor operations.
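The categories above can be sketched with one small example each. The tensors and variable names are arbitrary and chosen only to keep the results easy to check.

```python
import torch

# Structural operations
t = torch.arange(12).reshape(3, 4)       # creation + dimension transformation
first_col = t[:, 0]                      # index slicing
merged = torch.cat([t, t], dim=0)        # merging along rows
left, right = torch.split(t, 2, dim=1)   # splitting into column blocks of width 2

# Mathematical operations
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
scalar_op = a + 1                        # element-wise (scalar) operation
vector_op = a.sum(dim=0)                 # reduction along an axis
matrix_op = a @ a.t()                    # matrix multiplication

# Broadcasting: shapes (2, 1) and (3,) expand to a common shape (2, 3)
col = torch.tensor([[10.0], [20.0]])
row = torch.tensor([1.0, 2.0, 3.0])
broadcast_sum = col + row

print(merged.shape, left.shape, broadcast_sum)
```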

For the dynamic computation graph, we will mainly introduce its characteristics, the Function nodes in the graph, and how the graph relates to backpropagation.
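To make the role of a Function in the graph concrete, here is a sketch of a hand-written ReLU built on torch.autograd.Function. The class name MyReLU is hypothetical; the forward pass records what the backward pass needs, and backward returns the gradient with respect to the input.

```python
import torch

class MyReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # Save the input so backward can tell where the gradient passes through.
        ctx.save_for_backward(x)
        return x.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # The gradient flows only where the input was positive.
        return grad_output * (x > 0).to(grad_output.dtype)

x = torch.tensor([-1.0, 2.0, 3.0], requires_grad=True)
y = MyReLU.apply(x).sum()
y.backward()  # backpropagation walks the graph and calls MyReLU.backward

print(y.item(), x.grad)
```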

Five. PyTorch's mid-level API

We will mainly introduce the following parts of PyTorch's mid-level API:

  • Data pipelines

  • Model layers

  • Loss functions

  • TensorBoard visualization

If a model is compared to a house, the mid-level API is 【the walls of the model】.

Six. PyTorch's high-level API

PyTorch has no official high-level API. Models are usually built with nn.Module, and the training loop is written by hand.

To make training more convenient, the author wrote torchkeras, a Keras-like model interface for PyTorch, to serve as its high-level API.

This chapter covers the following topics related to the high-level API in detail.

  • 3 ways to build a model (subclass nn.Module, use nn.Sequential, use model containers as helpers)

  • 3 ways to train a model (script style, function style, torchkeras.Model class style)

  • Training on GPU (single-GPU training, multi-GPU training)
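As a sketch of two of these topics, the example below builds a model by subclassing nn.Module and moves it to a GPU when one is available. The class name Net and the layer sizes are assumptions for the illustration, and nn.DataParallel is shown only as one possible way to use several GPUs.

```python
import torch
from torch import nn

class Net(nn.Module):
    """A small binary classifier defined by subclassing nn.Module."""

    def __init__(self, in_features=15):
        super().__init__()
        self.hidden = nn.Linear(in_features, 20)
        self.out = nn.Linear(20, 1)

    def forward(self, x):
        x = torch.relu(self.hidden(x))
        return torch.sigmoid(self.out(x))

# Single-GPU training: move both the model and each batch to the same device.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = Net().to(device)

# Multi-GPU training: wrap the model when more than one GPU is visible.
if torch.cuda.device_count() > 1:
    net = nn.DataParallel(net)

batch = torch.randn(8, 15).to(device)
probs = net(batch)
print(probs.shape)
```

The model and its inputs must live on the same device, which is why both are moved with .to(device).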

1-1, An example of a structured data modeling process

import os
import datetime

# Print a separator bar followed by the current time
def printbar():
    nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    print("\n"+"=========="*8 + "%s"%nowtime)

# On macOS, running pytorch and matplotlib together in Jupyter requires setting this environment variable
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"

One, Prepare the data

The goal of the Titanic dataset is to predict, from passenger information, whether a passenger survived after the Titanic struck the iceberg.

Structured data is generally preprocessed with a Pandas DataFrame.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch
from torch import nn
from torch.utils.data import Dataset,DataLoader,TensorDataset
dftrain_raw = pd.read_csv('/home/kesci/input/data6936/data/titanic/train.csv')
dftest_raw = pd.read_csv('/home/kesci/input/data6936/data/titanic/test.csv')
dftrain_raw.head(10)

Field description:

  • Survived: 0 for died, 1 for survived 【y label】
  • Pclass: ticket class, one of three values (1, 2, 3) 【convert to one-hot encoding】
  • Name: passenger name 【dropped】
  • Sex: passenger sex 【convert to bool features】
  • Age: passenger age (has missing values) 【numerical feature; add "is age missing" as an auxiliary feature】
  • SibSp: number of the passenger's siblings/spouses (integer) 【numerical feature】
  • Parch: number of the passenger's parents/children (integer) 【numerical feature】
  • Ticket: ticket number (string) 【dropped】
  • Fare: ticket fare (float, ranging from 0 to 500) 【numerical feature】
  • Cabin: passenger cabin (has missing values) 【add "is cabin missing" as an auxiliary feature】
  • Embarked: port of embarkation: S, C, Q (has missing values) 【convert to one-hot encoding with four dimensions: S, C, Q, nan】

Pandas makes exploratory data analysis (EDA) straightforward.

Label distribution

%matplotlib inline
%config InlineBackend.figure_format = 'png'
ax = dftrain_raw['Survived'].value_counts().plot(kind = 'bar',
figsize = (12,8),fontsize=15,rot = 0)
ax.set_ylabel('Counts',fontsize = 15)
ax.set_xlabel('Survived',fontsize = 15)
plt.show()

 

Age distribution

%matplotlib inline
%config InlineBackend.figure_format = 'png'
ax = dftrain_raw['Age'].plot(kind = 'hist',bins = 20,color= 'purple',
                    figsize = (12,8),fontsize=15)
ax.set_ylabel('Frequency',fontsize = 15)
ax.set_xlabel('Age',fontsize = 15)
plt.show()

 

The formal data preprocessing follows.

def preprocessing(dfdata):
    dfresult = pd.DataFrame()

    # Pclass: one-hot encode
    dfPclass = pd.get_dummies(dfdata['Pclass'])
    dfPclass.columns = ['Pclass_' + str(x) for x in dfPclass.columns]
    dfresult = pd.concat([dfresult, dfPclass], axis=1)

    # Sex: one column per value
    dfSex = pd.get_dummies(dfdata['Sex'])
    dfresult = pd.concat([dfresult, dfSex], axis=1)

    # Age: fill missing values with 0 and add a missing-value indicator
    dfresult['Age'] = dfdata['Age'].fillna(0)
    dfresult['Age_null'] = pd.isna(dfdata['Age']).astype('int32')

    # SibSp, Parch, Fare: numerical features used as-is
    dfresult['SibSp'] = dfdata['SibSp']
    dfresult['Parch'] = dfdata['Parch']
    dfresult['Fare'] = dfdata['Fare']

    # Cabin: keep only a missing-value indicator
    dfresult['Cabin_null'] = pd.isna(dfdata['Cabin']).astype('int32')

    # Embarked: one-hot encode, with an extra column for missing values
    dfEmbarked = pd.get_dummies(dfdata['Embarked'], dummy_na=True)
    dfEmbarked.columns = ['Embarked_' + str(x) for x in dfEmbarked.columns]
    dfresult = pd.concat([dfresult, dfEmbarked], axis=1)

    return dfresult

# Cast to float32 so the arrays convert cleanly to tensors
# (pandas 2.0 and later return bool dummy columns, which would otherwise give an object array)
x_train = preprocessing(dftrain_raw).values.astype('float32')
y_train = dftrain_raw[['Survived']].values
x_test = preprocessing(dftest_raw).values.astype('float32')
y_test = dftest_raw[['Survived']].values

print("x_train.shape =", x_train.shape )
print("x_test.shape =", x_test.shape )
print("y_train.shape =", y_train.shape )
print("y_test.shape =", y_test.shape )
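To see what the missing-value indicator and the one-hot step in the preprocessing above produce, here is a minimal sketch on a hypothetical three-row sample. The rows are invented for illustration and are not taken from the Titanic files.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row sample, not taken from the Titanic files.
df = pd.DataFrame({
    "Age": [22.0, np.nan, 35.0],
    "Embarked": ["S", np.nan, "C"],
})

out = pd.DataFrame(index=df.index)
out["Age"] = df["Age"].fillna(0)                    # missing age becomes 0
out["Age_null"] = df["Age"].isna().astype("int32")  # 1 where age was missing

# dummy_na=True adds a separate column for missing ports.
embarked = pd.get_dummies(df["Embarked"], dummy_na=True)
embarked.columns = ["Embarked_" + str(c) for c in embarked.columns]

out = pd.concat([out, embarked], axis=1).astype("float32")
print(out)
```

The second row ends up with Age 0, Age_null 1, and Embarked_nan 1, so the model can tell a filled-in value from a real one.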

DataLoader and TensorDataset can then wrap the data into a pipeline that yields batches.

dl_train = DataLoader(TensorDataset(torch.tensor(x_train).float(),torch.tensor(y_train).float()),
                     shuffle = True, batch_size = 8)
dl_valid = DataLoader(TensorDataset(torch.tensor(x_test).float(),torch.tensor(y_test).float()),
                     shuffle = False, batch_size = 8)

# Test the data pipeline
for features,labels in dl_train:
    print(features,labels)
    break

Two, Define the model

There are usually three ways to build a model with PyTorch: use nn.Sequential to build it layer by layer, subclass nn.Module to build a custom model, or subclass nn.Module and use model containers to help encapsulate it.

Here we choose the simplest option, nn.Sequential, which stacks the layers in order.

def create_net():
    net = nn.Sequential()
    net.add_module("linear1",nn.Linear(15,20))
    net.add_module("relu1",nn.ReLU())
    net.add_module("linear2",nn.Linear(20,15))
    net.add_module("relu2",nn.ReLU())
    net.add_module("linear3",nn.Linear(15,1))
    net.add_module("sigmoid",nn.Sigmoid())
    return net

net = create_net()
print(net)

 

!pip install torchkeras
!pip install prettytable

 

from torchkeras import summary
summary(net,input_shape=(15,))
 

Three, Train the model

PyTorch usually requires the user to write a custom training loop, and the coding style of training loops varies from person to person.

There are 3 typical styles: a script-style training loop, a function-style training loop, and a class-style training loop.

Here we use the more general script style.

from sklearn.metrics import accuracy_score
loss_func = nn.BCELoss()
optimizer = torch.optim.Adam(params=net.parameters(),lr = 0.01)
metric_func = lambda y_pred,y_true: accuracy_score(y_true.data.numpy(),y_pred.data.numpy()>0.5)
metric_name = "accuracy"

 

epochs = 10
log_step_freq = 30

dfhistory = pd.DataFrame(columns = ["epoch","loss",metric_name,"val_loss","val_"+metric_name])
print("Start Training...")
nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
print("=========="*8 + "%s"%nowtime)

for epoch in range(1,epochs+1):

    # 1, Training loop -------------------------------------------------
    net.train()
    loss_sum = 0.0
    metric_sum = 0.0
    step = 1

    for step, (features,labels) in enumerate(dl_train, 1):

        # Zero the gradients
        optimizer.zero_grad()

        # Forward pass to compute the loss
        predictions = net(features)
        loss = loss_func(predictions,labels)
        metric = metric_func(predictions,labels)

        # Backward pass to compute gradients, then update the parameters
        loss.backward()
        optimizer.step()

        # Print batch-level logs
        loss_sum += loss.item()
        metric_sum += metric.item()
        if step%log_step_freq == 0:
            print(("[step = %d] loss: %.3f, "+metric_name+": %.3f") %
                  (step, loss_sum/step, metric_sum/step))

    # 2, Validation loop -------------------------------------------------
    net.eval()
    val_loss_sum = 0.0
    val_metric_sum = 0.0
    val_step = 1

    for val_step, (features,labels) in enumerate(dl_valid, 1):
        # Turn off gradient computation
        with torch.no_grad():
            predictions = net(features)
            val_loss = loss_func(predictions,labels)
            val_metric = metric_func(predictions,labels)
        val_loss_sum += val_loss.item()
        val_metric_sum += val_metric.item()

    # 3, Record logs -------------------------------------------------
    info = (epoch, loss_sum/step, metric_sum/step,
            val_loss_sum/val_step, val_metric_sum/val_step)
    dfhistory.loc[epoch-1] = info

    # Print epoch-level logs
    print(("\nEPOCH = %d, loss = %.3f,"+ metric_name + \
          " = %.3f, val_loss = %.3f, "+"val_"+ metric_name+" = %.3f")
          %info)
    nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    print("\n"+"=========="*8 + "%s"%nowtime)

print('Finished Training...')
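For contrast with the script style, a function-style loop might factor the per-batch work into helpers, roughly as sketched below. The helper names train_step and valid_step, the small demo network, and the random batch are assumptions for this illustration, not the article's own code.

```python
import torch
from torch import nn

def train_step(model, features, labels, loss_func, optimizer):
    """Run one optimization step on a batch and return the loss value."""
    model.train()
    optimizer.zero_grad()
    loss = loss_func(model(features), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def valid_step(model, features, labels, loss_func):
    """Compute the loss on a batch without tracking gradients."""
    model.eval()
    return loss_func(model(features), labels).item()

# Demo on a random batch with the same feature width as the Titanic data (15).
torch.manual_seed(0)
demo_net = nn.Sequential(nn.Linear(15, 1), nn.Sigmoid())
demo_loss = nn.BCELoss()
demo_opt = torch.optim.Adam(demo_net.parameters(), lr=0.01)
features = torch.randn(8, 15)
labels = torch.randint(0, 2, (8, 1)).float()

first = train_step(demo_net, features, labels, demo_loss, demo_opt)
for _ in range(50):
    last = train_step(demo_net, features, labels, demo_loss, demo_opt)
val = valid_step(demo_net, features, labels, demo_loss)
print(first, last, val)
```

The epoch loop then only iterates over the data pipelines and calls these two helpers, which keeps the logging code separate from the optimization code.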

Four, Evaluate the model

First we evaluate the model's performance on the training set and the validation set.

dfhistory 

%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import matplotlib.pyplot as plt

def plot_metric(dfhistory, metric):
    train_metrics = dfhistory[metric]
    val_metrics = dfhistory['val_'+metric]
    epochs = range(1, len(train_metrics) + 1)
    plt.plot(epochs, train_metrics, 'bo--')
    plt.plot(epochs, val_metrics, 'ro-')
    plt.title('Training and validation '+ metric)
    plt.xlabel("Epochs")
    plt.ylabel(metric)
    plt.legend(["train_"+metric, 'val_'+metric])
    plt.show()

plot_metric(dfhistory,"loss")

Five, Use the model

# Predicted probabilities
y_pred_probs = net(torch.tensor(x_test[0:10]).float()).data
y_pred_probs

# Predicted classes
y_pred = torch.where(y_pred_probs>0.5,
        torch.ones_like(y_pred_probs),torch.zeros_like(y_pred_probs))
y_pred

 

 

Six, Save the model

PyTorch offers two ways to save a model, both implemented through pickle serialization.

The first saves only the model parameters.

The second saves the entire model.

The first is recommended; the second can cause various problems when switching devices or directories.

1, Save model parameters (recommended)

torch.save(net.state_dict(), "./data/net_parameter.pkl")
net_clone = create_net()
net_clone.load_state_dict(torch.load("./data/net_parameter.pkl"))
net_clone.forward(torch.tensor(x_test[0:10]).float()).data
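The caution about switching devices can be illustrated with a self-contained sketch of the parameter-only round trip. The helper build_net, the temporary file path, and the choice of map_location="cpu" are assumptions made for the example.

```python
import os
import tempfile

import torch
from torch import nn

def build_net():
    # Hypothetical stand-in for the article's create_net().
    return nn.Sequential(nn.Linear(15, 1), nn.Sigmoid())

src = build_net()
path = os.path.join(tempfile.mkdtemp(), "net_parameter.pt")
torch.save(src.state_dict(), path)

# map_location="cpu" remaps tensors saved from a GPU onto the CPU while loading,
# so the file can be opened on a machine without that GPU.
state = torch.load(path, map_location="cpu")
dst = build_net()
dst.load_state_dict(state)
dst.eval()

x = torch.randn(4, 15)
with torch.no_grad():
    same = torch.equal(src(x), dst(x))
print(same)
```

Because only tensors are stored, the loading side needs just the code that rebuilds the architecture, not the original file layout.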

 

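2, Save the complete model (not recommended)

torch.save(net, './data/net_model.pkl')
# Note: from PyTorch 2.6 on, torch.load defaults to weights_only=True,
# so loading a whole pickled model requires passing weights_only=False.
net_loaded = torch.load('./data/net_model.pkl')
net_loaded(torch.tensor(x_test[0:10]).float()).data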

 

Copyright notice
This article was written by [The sky is full of stars_]. Please include a link to the original when reposting. Thank you.
