
Reinforcement learning

Reinforcement learning aims to create algorithms that can learn and adapt to changes in their environment. The technique is based on receiving external stimuli that depend on the choices the algorithm makes. A correct choice earns a reward, while an incorrect choice incurs a penalty. The goal of the system is to achieve the best possible result, that is, to collect as much reward as possible.

In supervised learning, a teacher tells the system the correct output for each input (learning with a teacher). This is not always possible. Often only qualitative information is available, sometimes as coarse as a binary right/wrong or success/failure signal.

This available information is called the reinforcement signal. The signal, however, gives no information on how to update the agent's behavior (that is, its weights): you cannot define a cost function or a gradient directly from it. The goal is to create smart agents that have the machinery to learn from their own experience.

The following flowchart shows how reinforcement learning works:

Figure 1.8: How reinforcement learning interacts with the environment
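
The interaction described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the TwoChoiceEnvironment and Agent classes, the success probabilities, the +1/-1 reward values, and the epsilon-greedy selection rule are all assumptions made for this example. The agent never sees the correct answer; it only receives a reward or a penalty after each choice and keeps a running average of what each action has earned.

```python
import random


class TwoChoiceEnvironment:
    """Hypothetical environment: action 1 is right more often than action 0."""

    def __init__(self, success_prob=(0.2, 0.8), seed=0):
        self.success_prob = success_prob
        self.rng = random.Random(seed)

    def step(self, action):
        # Reinforcement signal: +1 (reward) for a correct choice,
        # -1 (penalty) for an incorrect one. No "correct output" is revealed.
        correct = self.rng.random() < self.success_prob[action]
        return 1.0 if correct else -1.0


class Agent:
    """Hypothetical agent that estimates the average reward of each action."""

    def __init__(self, n_actions=2, epsilon=0.1, seed=1):
        self.values = [0.0] * n_actions  # estimated average reward per action
        self.counts = [0] * n_actions    # how often each action was tried
        self.epsilon = epsilon           # probability of exploring at random
        self.rng = random.Random(seed)

    def choose(self):
        # Explore occasionally; otherwise pick the best-looking action.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def learn(self, action, reward):
        # Incremental average: move the estimate toward the observed reward.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def run(steps=2000):
    env = TwoChoiceEnvironment()
    agent = Agent()
    total_reward = 0.0
    for _ in range(steps):
        action = agent.choose()       # the agent acts on the environment
        reward = env.step(action)     # the environment returns a signal
        agent.learn(action, reward)   # the agent updates from experience
        total_reward += reward
    return agent, total_reward


if __name__ == "__main__":
    agent, total_reward = run()
    print("estimated values:", agent.values)
    print("total reward:", total_reward)
```

Note that the update in learn uses only the observed reward, with no cost function or gradient. The small exploration probability is one simple way to keep the agent from settling on an action before it has tried the alternative often enough.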