The policy model for optimality

A policy is the model that guides the agent's action selection in different states. The policy is denoted as $\pi$. $\pi(a \mid s)$ is the probability of taking a certain action $a$ given a particular state $s$:

$$\pi(a \mid s) = P(A_t = a \mid S_t = s)$$

Thus, a policy maps each state to a set of probabilities over the possible actions in that state. Together, the policy and the value function form a solution that guides the agent's navigation, based on the policy's action probabilities and the calculated value of each state.
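To make the idea of a policy as per-state action probabilities concrete, here is a minimal sketch in Python. The state names (`s0`, `s1`), the action names (`left`, `right`), the probability values, and the helper functions are all hypothetical, chosen only for illustration; they do not come from the text.

```python
import random

# Hypothetical policy table: policy[state][action] is the probability
# of choosing `action` when the agent is in `state`.
policy = {
    "s0": {"left": 0.8, "right": 0.2},
    "s1": {"left": 0.1, "right": 0.9},
}


def action_probability(policy, state, action):
    """Return the probability of `action` in `state` (0.0 if the action is not listed)."""
    return policy[state].get(action, 0.0)


def is_valid_policy(policy, tol=1e-9):
    """Check that the action probabilities of every state sum to 1."""
    return all(abs(sum(dist.values()) - 1.0) <= tol for dist in policy.values())


def sample_action(policy, state, rng=random):
    """Draw one action for `state` according to the policy's probabilities."""
    actions = list(policy[state])
    weights = [policy[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


if __name__ == "__main__":
    print(is_valid_policy(policy))
    print(action_probability(policy, "s0", "left"))
    print(sample_action(policy, "s1"))
```

A dictionary of dictionaries is only one possible representation. For large or continuous state spaces, the same mapping would typically be produced by a parameterized function rather than a lookup table.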
