- PyTorch 1.x Reinforcement Learning Cookbook
- Yuxi (Hayden) Liu
- 2021-06-24 12:34:43
Markov Decision Processes and Dynamic Programming
In this chapter, we continue our practical reinforcement learning journey with PyTorch by looking at Markov decision processes (MDPs) and dynamic programming. The chapter starts with the creation of a Markov chain and an MDP, the model at the core of most reinforcement learning algorithms. You will also become more familiar with Bellman equations by practicing policy evaluation. We then apply two approaches to solving an MDP, value iteration and policy iteration, using the FrozenLake environment as an example. At the end of the chapter, we demonstrate step by step how to solve the coin-flipping gamble problem with dynamic programming.
The following recipes will be covered in this chapter:
- Creating a Markov chain
- Creating an MDP
- Performing policy evaluation
- Simulating the FrozenLake environment
- Solving an MDP with a value iteration algorithm
- Solving an MDP with a policy iteration algorithm
- Solving the coin-flipping gamble problem
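As a preview of the policy evaluation recipe, here is a minimal sketch of the Bellman expectation backup on a tiny problem. It is not the book's code: the two-state transition matrix `T`, the reward vector `R`, the discount `GAMMA`, and the function name `policy_evaluation` are all hypothetical values chosen for illustration, and plain Python lists stand in for tensors to keep the sketch dependency-free.

```python
# Hypothetical two-state problem under a fixed policy (illustrative values only).
# T[s][s2] is the probability of moving from state s to state s2 under the policy.
# R[s] is the expected immediate reward received in state s.
T = [[0.8, 0.2],
     [0.1, 0.9]]
R = [1.0, 0.0]
GAMMA = 0.5  # discount factor, assumed for this sketch


def policy_evaluation(T, R, gamma, threshold=1e-8):
    """Repeat the Bellman expectation backup until the values stop changing."""
    n = len(R)
    V = [0.0] * n
    while True:
        # V(s) <- R(s) + gamma * sum over s2 of T(s, s2) * V(s2)
        V_new = [R[s] + gamma * sum(T[s][s2] * V[s2] for s2 in range(n))
                 for s in range(n)]
        # Largest change in any state value during this sweep
        delta = max(abs(V_new[s] - V[s]) for s in range(n))
        V = V_new
        if delta < threshold:
            return V


V = policy_evaluation(T, R, GAMMA)
print(V)
```

Because the discount factor is below 1, each sweep shrinks the error, so the loop terminates. Value iteration, covered later in the chapter, follows the same pattern but takes a maximum over actions in each backup.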