
Summary

In this chapter, we learned about OpenAI Gym, including how to install it and how to use its key functions to load and render an environment and to inspect its state and action spaces. We learned about the Epsilon-Greedy approach as a solution to the exploration-exploitation dilemma, and implemented a basic Q-learning and Q-network algorithm to train a reinforcement-learning agent to navigate an environment from OpenAI Gym.
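As a compact reminder of how Epsilon-Greedy action selection and the tabular Q-learning update fit together, here is a minimal sketch. It is not the chapter's implementation: to stay self-contained it replaces the Gym environment with a hypothetical five-state chain (the `step` function), and the names `epsilon_greedy` and `train` and all hyperparameter values are assumptions chosen for illustration.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best-known action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties at random so all-zero rows do not always pick action 0.
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def step(state, action, n_states):
    """Hypothetical chain environment (a stand-in for a Gym environment).

    Action 1 moves right, action 0 moves left. Reaching the last state
    ends the episode with reward 1; every other transition gives reward 0.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = epsilon_greedy(q[state], epsilon, rng)
            next_state, reward, done = step(state, action, n_states)
            # Q-learning target: reward plus discounted best next-state value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy action for each non-terminal state after training.
greedy_policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(4)]
print(greedy_policy)
```

In this toy setting the learned greedy policy should favor moving right in every non-terminal state, since that is the shortest path to the only reward. With a real Gym environment, `step` would be replaced by the environment's own reset and step calls.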

In the next chapter, we will cover the most fundamental concepts in Reinforcement Learning, including Markov Decision Processes (MDPs), the Bellman Equation, and Markov Chain Monte Carlo.
