TensorFlow Reinforcement Learning Quick Start Guide
Kaushik Balakrishnan
Defining the actions of the agent
The agent performs actions to explore the environment. Choosing which action to take in each state is the central problem in RL; ideally, the agent learns to take optimal actions.
An action is the decision an agent takes in a certain state, s_t. It is typically denoted a_t, where, as before, the subscript t denotes the time instant. The actions available to an agent depend on the problem. For instance, an agent in a maze can decide to take a step north, south, east, or west. These are called discrete actions, as there is a fixed number of possibilities. For an autonomous car, on the other hand, the actions can be the steering angle, throttle value, brake value, and so on; these are called continuous actions, as they can take real-number values in a bounded range. For example, the steering angle could be 40 degrees from the north-south line, and the throttle could be 60% depressed.
Thus, actions a_t can be either discrete or continuous, depending on the problem at hand. Some RL approaches handle discrete actions, while others are suited to continuous actions; the sketch below illustrates both kinds of action space.
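As an illustration, here is a minimal sketch of the two kinds of action space using OpenAI Gym's spaces API. This assumes Gym is installed; the maze and car bounds are illustrative values chosen for this sketch, not taken from a specific environment.

```python
# A minimal sketch contrasting discrete and continuous action spaces.
# Assumes OpenAI Gym is installed; the bounds below are illustrative.
import numpy as np
from gym import spaces

# Discrete actions: a fixed set of choices, e.g. a maze agent that can
# step north, south, east, or west, encoded as the integers 0..3.
maze_actions = spaces.Discrete(4)
a_t = maze_actions.sample()          # e.g. 2, meaning "east"

# Continuous actions: real-valued vectors in a bounded range, e.g. a car
# with steering angle in [-40, 40] degrees and throttle and brake in [0, 1].
car_actions = spaces.Box(
    low=np.array([-40.0, 0.0, 0.0], dtype=np.float32),
    high=np.array([40.0, 1.0, 1.0], dtype=np.float32),
)
a_t = car_actions.sample()           # e.g. [12.7, 0.6, 0.0]

print(maze_actions.contains(3))      # True: 3 is a valid discrete action
print(car_actions.contains(a_t))     # True: the sampled vector is in bounds
```

Note that a discrete space is just a set of integer indices, while a continuous space is a box of real numbers with per-dimension bounds; this is why value-based methods such as Q-learning pair naturally with the former, and policy-based methods with the latter.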
A schematic of the agent and its interaction with the environment is shown in the following diagram:
[Figure: the agent-environment interaction loop, in which the agent observes state s_t, takes action a_t, and receives a reward and the next state from the environment]
Now that we know what an agent is, we will look at the policies that the agent learns, what value and advantage functions are, and how these quantities are used in RL.