- Python Reinforcement Learning
- Sudharsan Ravichandiran Sean Saito Rajalingappaa Shanmugamani Yang Wenzhuo
- 2021-06-24 15:17:32
State value function
The state value function, often called simply the value function, specifies how good it is for an agent to be in a particular state while following a policy π. It is usually denoted by V(s), or by V^π(s) when the policy needs to be made explicit.
We can define the state value function as follows:
$$V^{\pi}(s) = \mathbb{E}_{\pi}\left[ R_t \mid s_t = s \right]$$
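This is the expected return when the agent starts in state s and follows policy π from then on. Substituting the return R_t from (2), the discounted sum of future rewards, into the value function gives:
$$V^{\pi}(s) = \mathbb{E}_{\pi}\left[ \sum_{k=0}^{\infty} \gamma^{k} r_{t+k+1} \mid s_t = s \right]$$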
Note that the state value function depends on the policy: the same state can have a different value under a different policy.
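To make the dependence on the policy concrete, here is a minimal sketch that estimates the value of a state by averaging discounted returns over sampled episodes. The two-state environment, the reward values, the discount factor of 0.9, and the names `step`, `estimate_value`, `always_right`, and `coin_flip` are all assumptions made for illustration; they do not come from the book.

```python
import random

GAMMA = 0.9  # discount factor (assumed value for this sketch)

def step(state, action):
    """Hypothetical two-state environment: returns (next_state, reward, done)."""
    if state == "s1":
        if action == "right":
            return "s2", 0.0, False   # move on to s2, no reward yet
        return "s1", 0.0, True        # "left" ends the episode with no reward
    return "s2", 1.0, True            # any action in s2 pays 1 and ends the episode

def discounted_return(rewards, gamma=GAMMA):
    """Discounted sum of the rewards collected in one episode."""
    return sum(gamma ** k * r for k, r in enumerate(rewards))

def estimate_value(policy, start_state, episodes=5000, seed=0):
    """Average the discounted return over episodes sampled under the policy."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(episodes):
        state, done, rewards = start_state, False, []
        while not done:
            action = policy(state, rng)
            state, reward, done = step(state, action)
            rewards.append(reward)
        total += discounted_return(rewards)
    return total / episodes

def always_right(state, rng):
    """Deterministic policy: always choose "right"."""
    return "right"

def coin_flip(state, rng):
    """Stochastic policy: choose "left" or "right" with equal probability."""
    return rng.choice(["left", "right"])

v_right = estimate_value(always_right, "s1")
v_random = estimate_value(coin_flip, "s1")
print(f"V(s1) under always_right: {v_right:.3f}")
print(f"V(s1) under coin_flip:    {v_random:.3f}")
```

In this toy setup the same state s1 is worth about 0.9 under `always_right` but only about 0.45 under `coin_flip`, because the random policy ends half of its episodes before any reward is collected.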
We can view a value function as a table. Let us say we have two states and the agent follows the policy π in both of them. From the values of these two states, we can tell how good it is for the agent to be in each state when following that policy. The greater the value, the better the state:

Based on the preceding table, we can tell that it is better to be in state 2, as it has the higher value. We will see how to estimate these values intuitively in the upcoming sections.
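A value table of this kind can be sketched as a plain dictionary. The numbers below are hypothetical placeholders chosen only so that state 2 has the higher value, as in the text; the names `value_table` and `best_state` are likewise illustrative.

```python
# Hypothetical state values under some policy pi (illustrative numbers only).
value_table = {"state 1": 0.3, "state 2": 0.9}

# Print the table with a simple two-column layout.
print(f"{'State':<10}{'Value':>6}")
for state, value in value_table.items():
    print(f"{state:<10}{value:>6.1f}")

# The state with the greatest value is the best one to be in under this policy.
best_state = max(value_table, key=value_table.get)
print("Best state:", best_state)
```

A dictionary is enough while the number of states is small; larger problems usually store the values in an array indexed by state.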