
Gaming with Monte Carlo Methods

Monte Carlo methods are among the most widely used algorithms in fields ranging from physics and mechanics to computer science. In reinforcement learning (RL), the Monte Carlo algorithm is used when the model of the environment is not known. In the previous chapter, we used dynamic programming (DP) to find an optimal policy when the model dynamics, that is, the transition and reward probabilities, are known. But how can we determine the optimal policy when we don't know the model dynamics? In that case, we use the Monte Carlo algorithm, which learns from sampled experience rather than from a model and is therefore well suited to finding optimal policies when we have no knowledge of the environment.

In this chapter, you will learn about the following:

  • Monte Carlo methods
  • Monte Carlo prediction
  • Playing Blackjack with Monte Carlo
  • Monte Carlo control
  • Monte Carlo exploration starts
  • On-policy Monte Carlo control
  • Off-policy Monte Carlo control