- Reinforcement Learning with TensorFlow
- Sayon Dutta
- 81字
- 2021-08-27 18:51:57
The policy model for optimality
Policy is defined as the model that guides the agent with action selection in different states. Policy is denoted as .
is basically the probability of a certain action given a particular state:

Thus, a policy map will provide the set of probabilities of different actions given a particular state. The policy along with the value function create a solution that helps in agent navigation as per the policy and the calculated value of the state.
推薦閱讀
- 嵌入式系統及其開發應用
- 輕輕松松自動化測試
- Deep Learning Quick Reference
- Dreamweaver CS3網頁設計與網站建設詳解
- Windows程序設計與架構
- Blender Compositing and Post Processing
- Windows游戲程序設計基礎
- Troubleshooting OpenVPN
- 機器人人工智能
- 重估:人工智能與賦能社會
- Drupal高手建站技術手冊
- 時序大數據平臺TDengine核心原理與實戰
- Android High Performance Programming
- 仿蛇機器人的設計與制作
- Robust Cloud Integration with Azure