Learning PPO algorithm programming from scratch (Python version)

osc_ 3g4j2ghj 2021-01-21 10:29:33

Learning PPO Algorithm Programming from Scratch (PyTorch Edition) (Part 1)

This series of articles introduces programming the PPO (Proximal Policy Optimization) algorithm with PyTorch. It follows a PPO tutorial found online and was written while learning and practicing, in the hope of working through the whole process smoothly.

This first article gives a general overview of the workflow for writing the PPO algorithm and the files involved.

Prerequisites for learning PPO algorithm programming: Python, PyTorch, reinforcement learning, an introduction to policy gradient algorithms, and the theory behind PPO. Here are some learning references:
An intuitive understanding of the PPO algorithm
The PPO algorithm (theory part)
The PPO algorithm explained simply
The PG (policy gradient) algorithm
The policy gradient algorithm
A summary of reinforcement learning knowledge

Following the online tutorial, the training code is first split into 4 files: main.py, ppo.py, network.py and arguments.py.
arguments.py: Parses command line arguments; it can be called from the main function.
main.py: The executable file. It uses arguments.py to parse the command line arguments, then initializes the environment and the PPO model.
ppo.py: Holds the PPO model.
network.py: The neural network module used to define the Actor-Critic networks in the PPO model. It contains a feedforward neural network.
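As a rough illustration of how arguments.py and main.py might fit together, here is a minimal sketch using the standard argparse module. The function names (get_args, main) and the flags (--mode, --actor_model, --critic_model) are assumptions made for this example, not names confirmed by the tutorial, and the training and testing branches are placeholders.

```python
import argparse


def get_args(argv=None):
    """Parse command line arguments (flag names are illustrative assumptions)."""
    parser = argparse.ArgumentParser(description="Train or test a PPO policy")
    parser.add_argument("--mode", choices=["train", "test"], default="train",
                        help="train a model or test a saved one")
    parser.add_argument("--actor_model", default="",
                        help="path to a saved actor, e.g. ppo_actor.pth")
    parser.add_argument("--critic_model", default="",
                        help="path to a saved critic, e.g. ppo_critic.pth")
    return parser.parse_args(argv)


def main(argv=None):
    """Dispatch on the parsed mode; the real work is only hinted at here."""
    args = get_args(argv)
    if args.mode == "train":
        # Placeholder: main.py would initialize the environment and PPO model here.
        return f"train (actor={args.actor_model!r}, critic={args.critic_model!r})"
    # Placeholder: main.py would load the saved actor and run the test code here.
    return f"test (actor={args.actor_model!r})"


# Pass an explicit argument list instead of reading sys.argv.
print(main(["--mode", "test", "--actor_model", "ppo_actor.pth"]))
```

Keeping the parsing in its own function means main.py stays short and the same arguments can be reused when switching between training and testing.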
The Actor-Critic models are periodically saved to the binary files ppo_actor.pth and ppo_critic.pth, which can be loaded when testing or when continuing training.
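The following is a minimal sketch of what a feedforward network and the save/load step could look like in PyTorch. The class name FeedForwardNN, the layer sizes, and the observation and action dimensions are assumptions for illustration only. The files are written to a temporary directory here so the example leaves nothing behind.

```python
import os
import tempfile

import torch
from torch import nn


class FeedForwardNN(nn.Module):
    """A small feedforward network; the layer sizes are illustrative assumptions."""

    def __init__(self, in_dim, out_dim, hidden_dim=64):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, obs):
        # Accept plain lists as well as tensors.
        obs = torch.as_tensor(obs, dtype=torch.float32)
        return self.layers(obs)


obs_dim, act_dim = 3, 1                  # hypothetical environment dimensions
actor = FeedForwardNN(obs_dim, act_dim)  # observation -> action output
critic = FeedForwardNN(obs_dim, 1)       # observation -> state value estimate

# Save only the weights (state_dict) of each network.
save_dir = tempfile.mkdtemp()
actor_path = os.path.join(save_dir, "ppo_actor.pth")
critic_path = os.path.join(save_dir, "ppo_critic.pth")
torch.save(actor.state_dict(), actor_path)
torch.save(critic.state_dict(), critic_path)

# Later (testing or continued training): rebuild the network, then load the weights.
restored_actor = FeedForwardNN(obs_dim, act_dim)
restored_actor.load_state_dict(torch.load(actor_path))
```

Saving the state_dict rather than the whole module object means the loading side must construct a network with the same architecture before calling load_state_dict.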

The test code is mainly located in eval_policy.py, which is called from main.py.
eval_policy.py: Tests the trained policy in the specified environment. This module is completely independent of all the other files.
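To make the idea of an independent test module concrete, here is a small sketch of an evaluation loop. Everything in it is hypothetical: CountdownEnv is a toy stand-in environment with a simplified reset/step interface (not the Gym API), and the policy is any callable that maps an observation to an action, which is where a trained actor would be plugged in.

```python
class CountdownEnv:
    """Hypothetical stand-in environment: an episode ends after `horizon` steps."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return float(self.t)  # the observation is just the step counter

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        done = self.t >= self.horizon
        return float(self.t), reward, done


def rollout(policy, env):
    """Run one episode with `policy`; return (episode_length, episode_return)."""
    obs = env.reset()
    done = False
    ep_len, ep_ret = 0, 0.0
    while not done:
        action = policy(obs)
        obs, reward, done = env.step(action)
        ep_len += 1
        ep_ret += reward
    return ep_len, ep_ret


def eval_policy(policy, env, episodes=3):
    """Evaluate `policy` over a few episodes; return average length and return."""
    results = [rollout(policy, env) for _ in range(episodes)]
    avg_len = sum(r[0] for r in results) / episodes
    avg_ret = sum(r[1] for r in results) / episodes
    return avg_len, avg_ret


def always_one(obs):
    """Trivial policy standing in for the trained actor."""
    return 1


print(eval_policy(always_one, CountdownEnv(horizon=5)))  # prints (5.0, 5.0)
```

Because the evaluation code only needs a callable policy and an environment, it does not have to import anything from the training files.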

Figure: The training process
Figure: The testing process
Reference:
Coding PPO from Scratch with PyTorch

This article was written by [osc_ 3g4j2ghj]. Please include a link to the original when reposting. Thank you.
