Summary

In this chapter, we learned about OpenAI Gym, including how to install it and the important functions used to load and render an environment and to understand its state and action spaces. We learned about the epsilon-greedy approach as a solution to the exploration-exploitation dilemma, and implemented basic Q-learning and Q-network algorithms to train a reinforcement learning agent to navigate an environment from OpenAI Gym.
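As a brief recap, the two ideas above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the chapter's exact code: the function names `epsilon_greedy` and `q_update`, the list-of-lists Q-table, and the default values of `alpha` and `gamma` are all assumptions chosen for the example, and no Gym environment is required to run it.

```python
import random

def epsilon_greedy(q_row, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one.

    q_row is the list of Q-values for the current state (hypothetical layout).
    """
    if rng.random() < epsilon:
        # Explore: choose any action uniformly at random.
        return rng.randrange(len(q_row))
    # Exploit: choose the action with the highest estimated value.
    return max(range(len(q_row)), key=lambda a: q_row[a])

def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: move Q(s, a) toward the TD target."""
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])

# Tiny usage example with two states and two actions.
q_table = [[0.0, 0.0], [1.0, 2.0]]
action = epsilon_greedy(q_table[0], epsilon=0.1)
q_update(q_table, state=0, action=action, reward=1.0, next_state=1)
```

In a training loop, `epsilon` is commonly decayed over episodes so that the agent explores heavily at first and exploits its learned values later.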

In the next chapter, we will cover the most fundamental concepts in reinforcement learning, including Markov Decision Processes (MDPs), the Bellman equation, and Markov Chain Monte Carlo.
