Defining the actions of the agent

The agent performs actions to explore the environment. Determining which actions the agent should take is the primary goal in RL, and ideally the actions obtained are optimal.

An action is the decision an agent takes in a certain state, s_t. Typically, it is represented as a_t, where, as before, the subscript t denotes the time instant. The actions that are available to an agent depend on the problem. For instance, an agent in a maze can decide to take a step north, south, east, or west. These are called discrete actions, as there is a fixed number of possibilities. On the other hand, for an autonomous car, the actions can be the steering angle, throttle value, brake value, and so on. These are called continuous actions, as they can take real-number values in a bounded range. For example, the steering angle can be 40 degrees from the north-south line, the throttle can be 60% down, and so on.

Thus, the actions a_t can be either discrete or continuous, depending on the problem at hand. Some RL approaches handle discrete actions, while others are suited to continuous actions.
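
To make the distinction concrete, the following is a minimal sketch of the two kinds of action space, using only the Python standard library. The names MAZE_ACTIONS and CAR_ACTION_BOUNDS and the numeric bounds are illustrative assumptions, not values from the text, and uniform random sampling stands in for whatever policy an agent would actually learn.

```python
import random

# Discrete action space: a fixed, finite set of choices (the maze example).
MAZE_ACTIONS = ("north", "south", "east", "west")

# Continuous action space: each component is a real number in a bounded range.
# These bounds are assumed for illustration only.
CAR_ACTION_BOUNDS = {
    "steering_angle_deg": (-40.0, 40.0),
    "throttle": (0.0, 1.0),
    "brake": (0.0, 1.0),
}


def sample_discrete_action(rng):
    """Pick one of the finitely many maze actions at random."""
    return rng.choice(MAZE_ACTIONS)


def sample_continuous_action(rng):
    """Draw a real value inside the bounds of each car control."""
    return {
        name: rng.uniform(low, high)
        for name, (low, high) in CAR_ACTION_BOUNDS.items()
    }


rng = random.Random(0)  # seeded so that repeated runs give the same samples
a_discrete = sample_discrete_action(rng)
a_continuous = sample_continuous_action(rng)
print("discrete a_t:", a_discrete)
print("continuous a_t:", a_continuous)
```

A discrete action is a single choice from a list, whereas a continuous action is a vector of real numbers, which is one reason why different RL approaches suit the two cases.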

A schematic of the agent and its interaction with the environment is shown in the following diagram:

Figure 1: Schematic showing the agent and its interaction with the environment
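
The interaction in the schematic can also be sketched as a loop in code. The following is a minimal illustration under stated assumptions: CorridorEnv is a hypothetical toy environment invented for this example (a corridor of five cells with the goal at the last cell), its reward of 1.0 at the goal is assumed, and the agent uses a random policy rather than a learned one.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: a 1-D corridor with cells 0..length-1."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        """Put the agent back at cell 0 and return the initial state."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action (-1 = left, +1 = right) and return the outcome."""
        # Clamp the move so the agent stays inside the corridor.
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0  # assumed reward: 1 only at the goal
        return self.position, reward, done


def run_episode(env, rng, max_steps=100):
    """Run one episode and record (s_t, a_t, reward) at each time step."""
    state = env.reset()
    trajectory = []
    for _ in range(max_steps):
        action = rng.choice((-1, 1))  # the agent picks a_t in state s_t
        next_state, reward, done = env.step(action)  # the environment responds
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory


trajectory = run_episode(CorridorEnv(), random.Random(0))
print("steps taken:", len(trajectory))
```

At each time step the agent observes the state, chooses an action, and the environment returns the next state together with a reward. A learning agent would replace the random choice with a policy that improves from this feedback.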

Now that we know what an agent is, we will look at the policies that the agent learns, what value and advantage functions are, and how these quantities are used in RL.
