## Learning PPO algorithm programming from scratch (Python version)

osc_ 3g4j2ghj 2021-01-21 10:29:33

# Learning PPO Algorithm Programming from Scratch (PyTorch Edition), Part 1

This series of articles introduces programming the PPO (Proximal Policy Optimization) algorithm with PyTorch. It follows PPO tutorials found online and is written while learning, in the hope of working through the whole process smoothly.

This first article gives a general overview of the workflow for writing the PPO algorithm and the files involved.

Prerequisites for learning PPO algorithm programming: Python, PyTorch, reinforcement learning, an introduction to policy gradient algorithms, and the theory behind PPO. Some learning references:
An intuitive understanding of the PPO algorithm
The PPO algorithm (theory)
The PPO algorithm explained simply
The PG (policy gradient) algorithm
Reinforcement learning knowledge summary

Following the online tutorial, the training code is split into four files: main.py, ppo.py, network.py, and arguments.py.
arguments.py: parses the command-line arguments; it is called from main.py.
main.py: the executable entry point. It uses arguments.py to parse the command-line arguments, then initializes the environment and the PPO model.
ppo.py: contains the PPO model.
network.py: defines the neural network module, a feedforward network, that the PPO model uses for its actor and critic networks.
The actor and critic models are periodically saved to the binary files ppo_actor.pth and ppo_critic.pth, which can be loaded for testing or to continue training.
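
To make the roles of network.py and the two .pth files more concrete, here is a minimal sketch. It is not the tutorial's actual code: the class name `FeedForwardNN`, the hidden size of 64, the helper names `save_models` and `load_models`, and the example dimensions are all assumptions chosen for illustration. Only the file names ppo_actor.pth and ppo_critic.pth come from the text above.

```python
import os

import torch
from torch import nn


class FeedForwardNN(nn.Module):
    """A small feedforward network that can serve as either actor or critic.

    The layer sizes are illustrative assumptions, not values from the tutorial.
    """

    def __init__(self, in_dim: int, out_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.layers(obs)


def save_models(actor: nn.Module, critic: nn.Module, directory: str = ".") -> None:
    """Write the actor and critic weights to ppo_actor.pth and ppo_critic.pth."""
    torch.save(actor.state_dict(), os.path.join(directory, "ppo_actor.pth"))
    torch.save(critic.state_dict(), os.path.join(directory, "ppo_critic.pth"))


def load_models(actor: nn.Module, critic: nn.Module, directory: str = ".") -> None:
    """Restore weights into networks built with the same architecture."""
    actor.load_state_dict(torch.load(os.path.join(directory, "ppo_actor.pth")))
    critic.load_state_dict(torch.load(os.path.join(directory, "ppo_critic.pth")))


# Hypothetical sizes: 3-dimensional observations, 1-dimensional actions.
obs_dim, act_dim = 3, 1
actor = FeedForwardNN(obs_dim, act_dim)  # observation -> action output
critic = FeedForwardNN(obs_dim, 1)       # observation -> state-value estimate
```

Saving only the `state_dict` keeps the files small and independent of the class's import path, at the cost that the loading side must rebuild networks with the same architecture before calling `load_models`.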

The test code lives mainly in eval_policy.py and is called from main.py.
eval_policy.py: tests the trained policy in the specified environment. This module is completely independent of all the other files.

Reference:
Coding PPO from Scratch with PyTorch
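
As a rough sketch of how main.py might use arguments.py to choose between training and testing, the example below parses a few flags and dispatches on them. The flag names `--mode`, `--actor_model`, and `--critic_model`, and the functions `run_training` and `run_evaluation`, are assumptions made for illustration. The two functions are placeholders that only return a description of what would happen; a real version would build the environment, create the PPO model, and call into eval_policy.py.

```python
import argparse


def get_args(argv=None):
    """Parse command-line arguments (the flag names are illustrative assumptions)."""
    parser = argparse.ArgumentParser(description="Train or test a PPO agent")
    parser.add_argument("--mode", choices=["train", "test"], default="train")
    parser.add_argument("--actor_model", default="", help="path to saved actor weights")
    parser.add_argument("--critic_model", default="", help="path to saved critic weights")
    return parser.parse_args(argv)


def run_training(actor_model: str, critic_model: str) -> str:
    # Placeholder: a real version would create the environment and the PPO
    # model, optionally load existing weights, and start learning.
    return f"training (actor={actor_model or 'new'}, critic={critic_model or 'new'})"


def run_evaluation(actor_model: str) -> str:
    # Placeholder: a real version would load the actor and hand it to the
    # evaluation code in eval_policy.py.
    if not actor_model:
        raise ValueError("testing needs a saved actor, e.g. ppo_actor.pth")
    return f"evaluating {actor_model}"


def main(argv=None) -> str:
    args = get_args(argv)
    if args.mode == "train":
        return run_training(args.actor_model, args.critic_model)
    return run_evaluation(args.actor_model)
```

Under these assumed flags, a test run would be started with something like `python main.py --mode test --actor_model ppo_actor.pth`, while running main.py with no flags would start training from scratch.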

https://pythonmana.com/2021/01/20210121102234369V.html