
  • Python Reinforcement Learning
  • Sudharsan Ravichandiran, Sean Saito, Rajalingappaa Shanmugamani, Yang Wenzhuo
  • 338 words
  • 2021-06-24 15:17:31

Discount factor

We have seen that an agent's goal is to maximize the return. For an episodic task, we can define the return as R_t = r_{t+1} + r_{t+2} + ... + r_T, where T is the final time step of the episode, and we try to maximize the return R_t.

Since a continuous task has no final time step, we can define its return as R_t = r_{t+1} + r_{t+2} + ..., a sum with infinitely many terms that can grow without bound. But how can we maximize the return if the sum never stops?

That's why we introduce the notion of a discount factor. We can redefine our return with a discount factor γ (gamma), as follows:

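  R_t = r_{t+1} + \gamma r_{t+2} + \gamma^2 r_{t+3} + \cdots   ---(1)
  R_t = \sum_{k=0}^{\infty} \gamma^k r_{t+k+1}   ---(2)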
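
The discounted return can be sketched in a few lines of Python. This is only an illustration, assuming the return is the sum of rewards weighted by increasing powers of the discount factor; the function name `discounted_return` and the sample rewards are hypothetical and not from the book.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**k * rewards[k], where rewards[0] is the next reward."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    total = 0.0
    # Work backwards through the rewards: R = r + gamma * R_next.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical finite reward sequence, for illustration only.
rewards = [1.0, 1.0, 1.0, 1.0]
print(discounted_return(rewards, 0.0))  # 1.0   (only the first reward counts)
print(discounted_return(rewards, 0.5))  # 1.875 (1 + 0.5 + 0.25 + 0.125)
print(discounted_return(rewards, 1.0))  # 4.0   (plain, undiscounted sum)
```

The backward loop avoids computing powers of gamma explicitly, and it mirrors the recursive form R_t = r_{t+1} + gamma * R_{t+1}.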

The discount factor decides how much weight we give to future rewards relative to immediate rewards. Its value lies between 0 and 1. A discount factor of 0 means that only the immediate reward counts, while a discount factor of 1 means that future rewards count just as much as immediate rewards.

With a discount factor of 0, the agent never learns anything beyond the immediate reward; with a discount factor of 1, the agent keeps accumulating future rewards, and the return may grow to infinity. A useful value therefore lies strictly between these extremes; this book suggests the range 0.2 to 0.8, although the best choice depends on the task.
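
A short derivation shows why a discount factor below 1 keeps the return finite. As an assumption that is not stated in the text, suppose every reward is bounded in magnitude by a constant r_max, and that 0 ≤ γ < 1. The geometric series then gives:

```latex
|R_t| = \left| \sum_{k=0}^{\infty} \gamma^k r_{t+k+1} \right|
      \le \sum_{k=0}^{\infty} \gamma^k \, r_{\max}
      = \frac{r_{\max}}{1 - \gamma}
```

For example, with γ = 0.8 and r_max = 1 the return can never exceed 5, whereas with γ = 1 and a constant reward of 1 the sum diverges.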

We weight immediate and future rewards according to the use case. In some cases, future rewards are more desirable than immediate rewards, and in others the reverse holds. In a chess game, the goal is to checkmate the opponent's king. If we give importance to the immediate reward, which is earned by actions such as our pawn capturing an opponent's piece, the agent will learn to pursue this sub-goal instead of learning to reach the actual goal. So, in this case, we give importance to future rewards, whereas in other cases we prefer immediate rewards over future rewards. (Say, would you prefer chocolates if I gave you them today or 13 months later?)
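
The chess trade-off can be illustrated with a small Python sketch. The two reward sequences and the names `GREEDY`, `PATIENT`, and `preferred_plan` are hypothetical and chosen only for illustration; they do not come from the book or from any chess engine.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**k * rewards[k], where rewards[0] is the next reward."""
    return sum(reward * gamma ** k for k, reward in enumerate(rewards))


# Hypothetical reward sequences (assumed values, for illustration only).
GREEDY = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]    # capture a piece now, never win
PATIENT = [0.0, 0.0, 0.0, 0.0, 0.0, 10.0]  # no capture, checkmate 5 steps later


def preferred_plan(gamma):
    """Name the plan with the larger discounted return for this gamma."""
    greedy = discounted_return(GREEDY, gamma)
    patient = discounted_return(PATIENT, gamma)
    return "greedy" if greedy > patient else "patient"


for gamma in (0.2, 0.9):
    print(gamma, preferred_plan(gamma))
```

With these assumed numbers, a small discount factor such as 0.2 favors the immediate capture, while a larger one such as 0.9 favors waiting for the checkmate.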
