Chapter 6. Deep Q-Networks

In the previous chapter, we became familiar with the Bellman equation and the practical method of its application called Value iteration. This approach allowed us to significantly improve the speed and convergence of our solution in the FrozenLake environment, which is promising, but can we go further?

In this chapter, we'll try to apply the same theory to problems of much greater complexity: arcade games from the Atari 2600 platform, which are the de facto benchmark of the RL research community. To deal with this new and more challenging goal, we'll discuss the problems with the Value iteration method and introduce its variation, called Q-learning. In particular, we'll look at the application of Q-learning to so-called "grid world" environments, which is known as tabular Q-learning, and then we'll discuss Q-learning in combination with neural networks. This combination is called DQN (deep Q-network). At the end of the chapter, we'll reimplement the DQN algorithm from the famous paper Playing Atari with Deep Reinforcement Learning by V. Mnih and others, published in 2013, which started a new era in RL development.