Decaying gamma

Decaying gamma makes the agent prioritize short-term rewards as it learns what those rewards are, and put less emphasis on long-term rewards.

Remember that a gamma value of 0 causes an agent to disregard future values entirely and focus only on current rewards, while a gamma value of 1 causes it to weight future values the same as current ones. Decaying gamma therefore shifts the agent's focus toward current rewards and away from future rewards.
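The effect of gamma at its two extremes, and of shrinking it over time, can be illustrated with a minimal sketch. The function names, the sample rewards, and the exponential per-episode schedule are assumptions chosen for illustration; the text does not prescribe a particular decay schedule.

```python
def discounted_return(rewards, gamma):
    """Sum a reward sequence, weighting the reward t steps ahead by gamma**t."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def decayed_gamma(initial_gamma, decay_rate, episode, min_gamma=0.0):
    """Hypothetical schedule: multiply gamma by decay_rate once per episode,
    never letting it fall below min_gamma."""
    return max(min_gamma, initial_gamma * (decay_rate ** episode))


# Illustrative rewards: small immediate rewards, one large delayed reward.
rewards = [1.0, 1.0, 1.0, 10.0]

# gamma = 0: only the immediate reward counts.
only_current = discounted_return(rewards, 0.0)

# gamma = 1: future rewards count exactly as much as the current one.
all_equal = discounted_return(rewards, 1.0)

# As gamma decays across episodes, the delayed reward contributes less
# and less to the return the agent is trying to maximize.
for episode in range(0, 50, 10):
    gamma = decayed_gamma(initial_gamma=0.99, decay_rate=0.95, episode=episode)
    value = discounted_return(rewards, gamma)
    print(f"episode {episode:2d}  gamma {gamma:.3f}  return {value:.3f}")
```

With these sample rewards, the printed return falls as gamma decays, because the large reward three steps ahead is discounted more heavily in later episodes.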

Intuitively, this benefits us, because the closer we get to our goal, the more we want to take advantage of these short-term rewards instead of holding out for future rewards that won't be available after we complete the task. We can reach our goal faster and more efficiently by changing how we use the resources available to us as their availability changes.
