
Summary

In this chapter, we learned what Markov chains and Markov processes are and how RL problems are represented as Markov Decision Processes (MDPs). We also looked at the Bellman equation and solved it to derive an optimal policy using dynamic programming (DP). In Chapter 4, Gaming with Monte Carlo Methods, we will look at the Monte Carlo tree search and how to build intelligent games using it.
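As a quick refresher, the following is a minimal sketch of solving the Bellman optimality equation with DP (value iteration) and extracting a greedy policy from the converged value function. The toy MDP, its transition probabilities, rewards, and discount factor are illustrative assumptions, not examples taken from the chapter:

```python
# A minimal value iteration sketch on a toy, hypothetical two-state MDP.
# The transition probabilities, rewards, and discount factor below are
# illustrative assumptions, not values from the chapter.
import numpy as np

# P[s][a] is a list of (probability, next_state, reward) transitions.
P = {
    0: {0: [(0.9, 0, 0.0), (0.1, 1, 1.0)],
        1: [(0.2, 0, 0.0), (0.8, 1, 1.0)]},
    1: {0: [(1.0, 1, 2.0)],
        1: [(0.5, 0, 0.0), (0.5, 1, 2.0)]},
}
gamma = 0.9          # discount factor (assumed)
n_states, n_actions = 2, 2

def value_iteration(P, gamma, theta=1e-6):
    """Repeatedly apply the Bellman optimality backup until convergence."""
    V = np.zeros(n_states)
    while True:
        delta = 0.0
        for s in range(n_states):
            # Q(s, a) = sum over transitions of p * (r + gamma * V(s'))
            q = [sum(p * (r + gamma * V[s_next]) for p, s_next, r in P[s][a])
                 for a in range(n_actions)]
            best = max(q)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # Extract a greedy (optimal) policy from the converged value function.
    policy = np.array([
        int(np.argmax([sum(p * (r + gamma * V[s_next])
                           for p, s_next, r in P[s][a])
                       for a in range(n_actions)]))
        for s in range(n_states)
    ])
    return V, policy

V, policy = value_iteration(P, gamma)
print("Optimal state values:", V)
print("Greedy policy:", policy)
```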
