Discount factor

We have seen that the agent's goal is to maximize the return. For an episodic task, we can define the return as $R_t = r_{t+1} + r_{t+2} + \cdots + r_T$, where $T$ is the final time step of the episode, and we try to maximize $R_t$.

Since a continuous task has no final state, we can define its return as $R_t = r_{t+1} + r_{t+2} + \cdots$, a sum with infinitely many terms. But how can we maximize the return if the sum never stops?

That's why we introduce the notion of a discount factor. We can redefine our return with a discount factor $\gamma$, as follows:

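$$R_t = r_{t+1} + \gamma r_{t+2} + \gamma^2 r_{t+3} + \cdots$$ ---(1)

$$R_t = \sum_{k=0}^{\infty} \gamma^k r_{t+k+1}$$ ---(2)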
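
As a minimal sketch of equation (2), the following Python function computes the discounted return for a finite list of rewards, where each reward is weighted by one more power of the discount factor than the one before it. The function name `discounted_return` and the sample reward values are assumptions made here for illustration; they do not come from the text.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Return rewards[0] + gamma*rewards[1] + gamma**2*rewards[2] + ...

    rewards[k] plays the role of r_{t+k+1} in equation (2).
    Only a finite list is summed, so this is a truncated version of the series.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    total = 0.0
    # Walk backwards so each step needs one multiply and one add.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


rewards = [1.0, 1.0, 1.0, 1.0]  # hypothetical rewards, chosen for illustration
print(discounted_return(rewards, 0.0))  # only the first reward counts
print(discounted_return(rewards, 0.5))  # each later reward is halved again
print(discounted_return(rewards, 1.0))  # plain undiscounted sum
```

Summing from the last reward backwards avoids computing explicit powers of the discount factor, which keeps the loop simple for long reward lists.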

The discount factor decides how much importance we give to future rewards relative to immediate rewards. Its value lies between 0 and 1. A discount factor of 0 means that only the immediate reward counts, while a discount factor of 1 means that future rewards count just as much as immediate rewards.

With a discount factor of 0, the agent considers only the immediate reward and never learns to plan ahead; with a discount factor of 1, every future reward counts in full, so in a continuous task the return may grow to infinity. As a rough rule of thumb, a value between 0.2 and 0.8 is a reasonable starting point, though the best choice depends on the task.
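
To see why a discount factor below 1 keeps the return finite, here is a short worked bound. It assumes that every reward is bounded in magnitude by some constant $r_{\max}$; this bound is an assumption introduced here for illustration and is not stated in the text above.

$$|R_t| = \left|\sum_{k=0}^{\infty} \gamma^k r_{t+k+1}\right| \le \sum_{k=0}^{\infty} \gamma^k r_{\max} = \frac{r_{\max}}{1-\gamma}, \qquad 0 \le \gamma < 1.$$

For example, with $r_{\max} = 1$ and $\gamma = 0.8$, the return can never exceed $1/(1 - 0.8) = 5$, whereas with $\gamma = 1$ a reward of 1 at every step sums without bound.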

How much weight we give to immediate versus future rewards depends on the use case. In some cases, future rewards are more desirable than immediate rewards, and in others the reverse holds. In a chess game, the goal is to defeat the opponent's king. If we give too much importance to the immediate reward, which is earned by actions such as our pawn capturing an opponent's piece, the agent will learn to pursue this sub-goal instead of learning to reach the actual goal. So, in this case, we give importance to future rewards, whereas in other cases we prefer immediate rewards over future rewards. (Say, would you prefer chocolates if I gave them to you today or 13 months later?)
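
The trade-off in the chess example can be sketched with two made-up reward sequences: one that pays a small reward right away and one that pays a larger reward several steps later. The reward values, the names `capture_now` and `checkmate_later`, and the helper functions are all hypothetical and chosen only to illustrate how the discount factor changes which plan looks better.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**k * rewards[k] over a finite list of rewards."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))


# Hypothetical reward sequences, invented for illustration only.
capture_now = [1.0, 0.0, 0.0, 0.0, 0.0]       # small reward immediately
checkmate_later = [0.0, 0.0, 0.0, 0.0, 10.0]  # large reward four steps later


def preferred_plan(gamma):
    """Name the plan with the larger discounted return for this gamma."""
    now = discounted_return(capture_now, gamma)
    later = discounted_return(checkmate_later, gamma)
    return "capture_now" if now > later else "checkmate_later"


for gamma in (0.1, 0.5, 0.9):
    print(gamma, preferred_plan(gamma))
```

With these particular numbers, a small discount factor favors the immediate capture, while a discount factor near 1 favors the delayed checkmate.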
