The policy model for optimality

A policy is the model that guides the agent's choice of action in each state. The policy is denoted by $\pi$, and $\pi(a \mid s)$ is the probability of taking a certain action $a$ given a particular state $s$:

$$\pi(a \mid s) = P(A_t = a \mid S_t = s)$$

Thus, a policy maps each state to a set of probabilities over the different actions available in that state. Together, the policy and the value function form a solution: the agent navigates the environment by selecting actions according to the policy, guided by the calculated value of each state.
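
To make the idea of a policy as a per-state probability distribution concrete, here is a minimal Python sketch. The state names, action names, probabilities, and helper functions are all hypothetical and chosen only for illustration; they are not taken from the text. The sketch stores the policy as a lookup table, checks that each state's probabilities sum to 1, and samples an action for a given state.

```python
import random

# Hypothetical tabular policy: policy[state][action] = probability of that
# action in that state. States, actions, and numbers are illustrative only.
policy = {
    "s0": {"left": 0.8, "right": 0.2},
    "s1": {"left": 0.1, "right": 0.9},
}


def action_probability(policy, state, action):
    """Return the probability of `action` in `state` (0.0 if the action is not listed)."""
    return policy[state].get(action, 0.0)


def is_valid_policy(policy, tol=1e-9):
    """Check that every state's action probabilities are non-negative and sum to 1."""
    return all(
        all(p >= 0 for p in dist.values()) and abs(sum(dist.values()) - 1.0) <= tol
        for dist in policy.values()
    )


def sample_action(policy, state, rng=random):
    """Draw one action for `state` according to the policy's probabilities."""
    actions = list(policy[state].keys())
    weights = list(policy[state].values())
    return rng.choices(actions, weights=weights, k=1)[0]


if __name__ == "__main__":
    print(is_valid_policy(policy))
    print(action_probability(policy, "s0", "left"))
    print(sample_action(policy, "s1"))
```

A table like this only works for small, discrete state and action spaces; for larger problems the same mapping would typically be represented by a parameterized function instead.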
