Title: Python Reinforcement Learning
Authors: Sudharsan Ravichandiran, Sean Saito, Rajalingappaa Shanmugamani, Yang Wenzhuo
Updated: 2021-06-24 15:17:32

The policy function

In Chapter 1, Introduction to Reinforcement Learning, we learned about the policy function, which maps states to actions. It is denoted by π.

The policy function can be represented as π: S → A, indicating a mapping from the set of states S to the set of actions A. In other words, a policy function says which action to perform in each state. Our ultimate goal is to find the optimal policy, that is, the policy that specifies in each state the action that maximizes the reward the agent accumulates.
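As a minimal sketch of this mapping, a deterministic policy over a small, finite set of states can be written as a plain Python dictionary. The state names, action names, and the `act` helper below are hypothetical and chosen only for illustration; they do not come from the book or from any library.

```python
# A deterministic policy as a lookup table: state -> action.
# The states and actions here are made up for illustration.
policy = {
    "low_battery": "recharge",
    "high_battery": "search",
}


def act(policy, state):
    """Return the action the policy prescribes for the given state."""
    return policy[state]


# Ask the policy what to do in one particular state.
chosen_action = act(policy, "low_battery")
print(chosen_action)
```

A lookup table like this only works when the states can be enumerated. Finding the optimal policy then amounts to choosing, for every key, the action that leads to the most reward.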
