
States, actions, and rewards

What does it mean to be in a state, to take an action, or to receive a reward? These are the most important concepts for us to understand intuitively, so let's dig deeper into them. The following diagram depicts the agent-environment interaction in an MDP:

The agent interacts with the environment through actions, and it receives rewards and state information from the environment. In other words, the states and rewards are feedback from the environment, and the actions are inputs to the environment from the agent. 
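The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ToyEnvironment`, its `reset` and `step` methods, the reward values, and the `always_forward` policy are all hypothetical names and numbers chosen for this example, not part of any particular library.

```python
class ToyEnvironment:
    """Hypothetical 1-D environment: the agent starts a few units from a goal."""

    def __init__(self, start_distance=5):
        self.start_distance = start_distance
        self.distance = start_distance

    def reset(self):
        # Return the initial state (here, just the distance to the goal).
        self.distance = self.start_distance
        return self.distance

    def step(self, action):
        # Action 1 moves one unit toward the goal; action 0 stays in place.
        if action == 1:
            self.distance -= 1
        done = self.distance == 0
        # Assumed reward scheme: -1 per step, +10 on reaching the goal.
        reward = 10 if done else -1
        return self.distance, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode and return the total reward and the visited transitions."""
    state = env.reset()
    total_reward = 0
    trajectory = []
    for _ in range(max_steps):
        action = policy(state)                       # agent -> environment
        next_state, reward, done = env.step(action)  # environment -> agent
        trajectory.append((state, action, reward))
        total_reward += reward
        state = next_state
        if done:
            break
    return total_reward, trajectory


def always_forward(state):
    # A trivial policy that ignores the state and always moves toward the goal.
    return 1


total, steps = run_episode(ToyEnvironment(start_distance=3), always_forward)
```

Here the action is the only thing the agent sends to the environment, and the state and reward are the only things it gets back, which mirrors the feedback loop in the text.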

Going back to our simple driving simulator example, the state describes the agent's current situation: it might be moving or stopped at a red light; it might be turning left, turning right, or heading straight; there might or might not be other cars in the intersection; and it will be some distance, say X units, from the destination.
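One way to make this concrete is to bundle those observations into a single state object. The sketch below is an assumption for illustration only: the class name `DrivingState`, its field names, and the `Heading` enum are hypothetical and do not come from the simulator itself.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Heading(Enum):
    STRAIGHT = "straight"
    LEFT = "left"
    RIGHT = "right"


@dataclass(frozen=True)
class DrivingState:
    """Hypothetical snapshot of what the agent knows at one time step."""
    moving: bool                       # False when stopped, e.g. at a red light
    heading: Heading                   # turning left, turning right, or straight
    other_cars_in_intersection: bool
    distance_to_destination: float     # the "X units" from the text


# The agent is waiting at a red light, 12 units from the destination.
state = DrivingState(
    moving=False,
    heading=Heading.STRAIGHT,
    other_cars_in_intersection=True,
    distance_to_destination=12.0,
)

# After the light changes and the agent advances one unit, we get a new state.
next_state = replace(state, moving=True, distance_to_destination=11.0)
```

Marking the dataclass as frozen makes each state immutable and hashable, so states can be used as dictionary keys, which is convenient if values are later stored per state in a lookup table.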
