Reinforcement learning

Reinforcement learning aims to create algorithms that can learn and adapt to environmental changes. This programming technique is based on the concept of receiving external stimuli that depend on the algorithm's choices. A correct choice earns a reward, while an incorrect choice incurs a penalty. The goal of the system is to achieve the best possible result.

In supervised learning, a teacher tells the system the correct output (learning with a teacher). This is not always possible. Often only qualitative information is available (sometimes binary: right/wrong or success/failure).

This available information is called the reinforcement signal. The signal, however, gives no information on how to update the agent's behavior (that is, its weights). Because the correct output is never given, you cannot define a cost function or a gradient as in supervised learning. The goal is to create smart agents that have the machinery to learn from their experience.
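To make the idea of learning from a reinforcement signal concrete, here is a minimal sketch. It assumes a hypothetical two-choice problem in which each choice returns +1 (reward) or -1 (penalty) with a fixed probability. The function name `run_bandit`, the probabilities, and the epsilon-greedy strategy are illustrative assumptions, not something taken from the text. Note that the agent is never told which choice is correct; it only keeps a running average of the signals it has received.

```python
import random


def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Learn which choice pays off best using only reward/penalty signals.

    reward_probs[i] is the (hidden) probability that choice i earns +1;
    otherwise it earns -1. All names and values here are illustrative.
    """
    rng = random.Random(seed)
    n_choices = len(reward_probs)
    estimates = [0.0] * n_choices  # running average reward per choice
    counts = [0] * n_choices       # how often each choice was tried

    for _ in range(steps):
        # Explore a random choice occasionally, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            action = rng.randrange(n_choices)
        else:
            action = max(range(n_choices), key=lambda a: estimates[a])

        # The environment answers only with a reward or a penalty.
        reward = 1.0 if rng.random() < reward_probs[action] else -1.0

        # Incremental average: move the estimate toward the observed signal.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_bandit([0.2, 0.8])
    print("estimated value per choice:", estimates)
    print("times each choice was tried:", counts)
```

The update uses no cost function or gradient; each estimate simply drifts toward the average signal observed for that choice, which is one simple way an agent can learn from experience alone.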

This flowchart shows reinforcement learning: 

Figure 1.8: How reinforcement learning interacts with the environment
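The interaction in the figure can be sketched in code, assuming it depicts the usual loop in which the agent observes a state, chooses an action, and receives a reward and a new state from the environment. Everything below is hypothetical: the `CorridorEnv` class, its reward values, and the `run_episode` helper are made up for illustration and do not follow any particular library's API.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: a corridor of cells 0..length-1.

    The agent starts in cell 0 and the episode ends when it reaches the
    last cell. Each step costs -0.1 (penalty); reaching the goal pays +1.0.
    """

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action is -1 (move left) or +1 (move right); walls clamp the move.
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else -0.1
        return self.position, reward, done


def run_episode(env, policy, max_steps=1000):
    """Run one agent-environment loop and return (total_reward, steps_taken)."""
    state = env.reset()
    total_reward = 0.0
    for step in range(1, max_steps + 1):
        action = policy(state)                  # agent acts on the current state
        state, reward, done = env.step(action)  # environment responds
        total_reward += reward                  # reinforcement signal accumulates
        if done:
            return total_reward, step
    return total_reward, max_steps


if __name__ == "__main__":
    rng = random.Random(0)
    random_policy = lambda state: rng.choice([-1, 1])
    always_right = lambda state: 1

    print("random policy:", run_episode(CorridorEnv(), random_policy))
    print("always right: ", run_episode(CorridorEnv(), always_right))
```

The two policies here do not learn; they only show the loop. A learning agent would use the rewards it collects to change its policy between steps or episodes.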