
Summary

RL is one of the fundamental paradigms under the umbrella of machine learning. The principles of RL are very general and interdisciplinary, and they are not bound to a specific application.

RL considers the interaction of an agent with an external environment, taking inspiration from the human learning process. RL explicitly targets the need to explore efficiently, and the exploration-exploitation trade-off that appears in almost all sequential decision problems; this emphasis distinguishes the discipline from other machine learning paradigms.

We started this chapter with a high-level description of RL, showing some interesting applications. We then introduced the main concepts of RL, describing what an agent is, what an environment is, and how an agent interacts with its environment. Finally, we introduced Gym and Baselines, showing how these libraries make experimenting with RL extremely simple.

In the next chapter, we will learn more about the theory behind RL, starting with Markov chains and arriving at Markov decision processes (MDPs). We will present the two functions at the core of almost all RL algorithms, namely the state-value function, which evaluates the goodness of a state, and the action-value function, which evaluates the quality of a state-action pair.
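As a preview, these two functions are commonly written as expected discounted returns under a policy $\pi$ with discount factor $\gamma$ (a standard formulation; the next chapter defines every term formally):

```latex
V^{\pi}(s) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t+1} \,\middle|\, s_0 = s\right]

Q^{\pi}(s, a) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t+1} \,\middle|\, s_0 = s,\ a_0 = a\right]
```

Intuitively, $V^{\pi}(s)$ measures how much reward the agent can expect starting from state $s$ and following $\pi$, while $Q^{\pi}(s, a)$ measures the same quantity when the first action $a$ is fixed.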
