
Chapter 6. Deep Q-Networks

In the previous chapter, we became familiar with the Bellman equation and the practical method of applying it, called Value iteration. This approach allowed us to significantly improve the speed and convergence of our solution to the FrozenLake environment, which is promising, but can we go further?

In this chapter, we'll try to apply the same theory to problems of much greater complexity: arcade games from the Atari 2600 platform, which are the de facto benchmark of the RL research community. To deal with this new and more challenging goal, we'll discuss problems with the Value iteration method and introduce its variation, called Q-learning. In particular, we'll look at the application of Q-learning to so-called "grid world" environments, which is known as tabular Q-learning, and then we'll discuss Q-learning in conjunction with neural networks. This combination is called a deep Q-network (DQN). At the end of the chapter, we'll reimplement the DQN algorithm from the famous paper, Playing Atari with Deep Reinforcement Learning by V. Mnih and others, published in 2013, which started a new era in RL development.
