
Markov Decision Processes and Dynamic Programming

In this chapter, we will continue our practical reinforcement learning journey with PyTorch by looking at Markov decision processes (MDPs) and dynamic programming. The chapter starts with the creation of a Markov chain and an MDP, which is the core of most reinforcement learning algorithms. You will also become more familiar with Bellman equations by performing policy evaluation. We will then move on to two approaches for solving an MDP: value iteration and policy iteration, using the FrozenLake environment as an example. At the end of the chapter, we will demonstrate, step by step, how to solve the coin-flipping gamble problem with dynamic programming.

The following recipes will be covered in this chapter:

  • Creating a Markov chain
  • Creating an MDP
  • Performing policy evaluation
  • Simulating the FrozenLake environment
  • Solving an MDP with a value iteration algorithm
  • Solving an MDP with a policy iteration algorithm
  • Solving the coin-flipping gamble problem
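
As a taste of the first recipe, the following is a minimal sketch of representing a Markov chain in PyTorch; the two-state transition matrix and starting distribution here are hypothetical placeholders, not the values used in the recipe itself:

```python
import torch

# Hypothetical two-state Markov chain (illustrative only).
# T[i, j] is the probability of moving from state i to state j;
# each row sums to 1 (row-stochastic matrix).
T = torch.tensor([[0.4, 0.6],
                  [0.8, 0.2]])

# An arbitrary starting distribution over the two states.
v = torch.tensor([[0.7, 0.3]])

# Propagate the distribution k steps: v_k = v_0 @ T^k.
for _ in range(10):
    v = torch.matmul(v, T)

print(v)  # the distribution approaches the chain's stationary distribution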
```