- PyTorch 1.x Reinforcement Learning Cookbook
- Yuxi (Hayden) Liu
- 2021-06-24 12:34:43
Developing a policy gradient algorithm
The last recipe of the first chapter solves the CartPole environment with a policy gradient algorithm. This is more machinery than this simple problem needs, since the random search and hill-climbing algorithms already suffice. However, it is a great algorithm to learn, and we will use it in more complicated environments later in the book.
In the policy gradient algorithm, the model weight is updated at the end of each episode by moving it in the direction of the gradient. We will explain the computation of gradients in the next section. In each step, the algorithm samples an action from the policy according to probabilities computed from the state and the weight. Unlike random search and hill climbing, which always take the action with the higher score, it no longer takes an action with certainty. Hence, the policy switches from deterministic to stochastic.
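To make the contrast between a deterministic and a stochastic policy concrete, here is a minimal sketch in plain Python. It is not the recipe's code. It assumes a linear model with a softmax over two actions, a weight stored as a 4 x 2 nested list, and an undiscounted reward-to-go as the scaling factor for each step's gradient. The function names (`action_probs`, `greedy_action`, `sample_action`, `update_weight`) are hypothetical and chosen only for illustration. In a PyTorch version, the same steps would typically be written with `torch.matmul`, `torch.softmax`, and `torch.multinomial`.

```python
import math
import random


def action_probs(state, weight):
    """Softmax over linear scores; weight[i][a] links state feature i to action a."""
    n_actions = len(weight[0])
    scores = [sum(s * weight[i][a] for i, s in enumerate(state))
              for a in range(n_actions)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(z - top) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_action(state, weight):
    """Deterministic choice, as in random search and hill climbing."""
    probs = action_probs(state, weight)
    return max(range(len(probs)), key=probs.__getitem__)


def sample_action(state, weight, rng=random):
    """Stochastic choice: draw an action according to its probability."""
    probs = action_probs(state, weight)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def log_prob_gradient(state, weight, action):
    """Gradient of log pi(action | state) with respect to the weight.

    For a softmax-linear policy: grad[i][a] = state[i] * (1[a == action] - probs[a]).
    """
    probs = action_probs(state, weight)
    return [[s * ((1.0 if a == action else 0.0) - probs[a])
             for a in range(len(probs))]
            for s in state]


def update_weight(weight, episode, learning_rate=0.01):
    """Return a new weight, updated once at the end of an episode.

    episode is a list of (state, action, reward) tuples. Each step's gradient
    is scaled by the reward collected from that step onward (an assumption of
    this sketch, not necessarily the recipe's exact rule).
    """
    new_weight = [row[:] for row in weight]
    reward_to_go = 0.0
    for state, action, reward in reversed(episode):
        reward_to_go += reward
        grad = log_prob_gradient(state, weight, action)
        for i, row in enumerate(grad):
            for a, g in enumerate(row):
                new_weight[i][a] += learning_rate * g * reward_to_go
    return new_weight


if __name__ == "__main__":
    rng = random.Random(0)
    weight = [[0.0, 0.0] for _ in range(4)]  # CartPole: 4 state features, 2 actions
    state = [0.02, -0.01, 0.03, 0.04]        # made-up observation for illustration
    print(action_probs(state, weight))       # equal probabilities with a zero weight
    print([sample_action(state, weight, rng) for _ in range(10)])
```

With a zero weight, both actions are equally likely, so `sample_action` returns a mix of 0 and 1 while `greedy_action` always returns the same action for a given state. Actions that were followed by larger rewards receive larger updates, which is what pushes the sampled policy toward better behavior over many episodes.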