- Reinforcement Learning with TensorFlow
- Sayon Dutta
- 2021-08-27 18:51:57
The policy model for optimality
A policy is the model that guides the agent's action selection in each state. It is denoted by $\pi$.
$\pi$ is the probability of taking a certain action given a particular state:

$$\pi(a \mid s) = P(A_t = a \mid S_t = s)$$

Thus, a policy map gives the probabilities of the different actions available in a particular state. Together, the policy and the value function form a solution: the agent navigates according to the policy and the calculated value of each state.
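To make the idea of a policy map concrete, here is a minimal sketch of a tabular stochastic policy in plain Python. The state names (`s0`, `s1`), the action names (`left`, `right`), the probability values, and the helper functions `action_probability` and `sample_action` are all assumptions made for illustration; none of them come from the book.

```python
import random

# Hypothetical policy map: each state maps to a probability distribution
# over actions. States, actions, and numbers are illustrative assumptions.
policy = {
    "s0": {"left": 0.2, "right": 0.8},
    "s1": {"left": 0.5, "right": 0.5},
}


def action_probability(policy, state, action):
    """Return the probability of choosing `action` in `state`."""
    return policy[state].get(action, 0.0)


def sample_action(policy, state, rng=random):
    """Draw one action according to the policy's probabilities for `state`."""
    actions = list(policy[state])
    weights = [policy[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


# Each row of the policy map should be a valid probability distribution.
for state, dist in policy.items():
    assert abs(sum(dist.values()) - 1.0) < 1e-9
```

Calling `sample_action(policy, "s0")` repeatedly would be expected to return `right` roughly four times out of five, since that action carries probability 0.8 in state `s0`. A dictionary is used here only because it keeps the mapping readable; a practical agent would more likely store the same table as an array or produce it from a neural network.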