- Reinforcement Learning with TensorFlow
- Sayon Dutta
- 2021-08-27 18:51:57
Basic terminologies and conventions
The following are the basic terminologies associated with reinforcement learning:
- Agent: A program we create that can sense the environment, perform actions, receive feedback, and try to maximize rewards.
- Environment: The world where the agent resides. It can be real or simulated.
- State: The perception or configuration of the environment that the agent senses. State spaces can be finite or infinite.
- Rewards: Feedback the agent receives after each action it takes. The goal of the agent is to maximize the overall reward, that is, the immediate reward plus the future rewards. Rewards are defined in advance, so they must be designed carefully for the agent to achieve the goal efficiently.
- Actions: Anything that the agent is capable of doing in the given environment. The action space can be finite or infinite.
- SAR triple: (state, action, reward) is referred to as the SAR triple, represented as (s, a, r).
- Episode: Represents one complete run of the whole task.
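The terms above can be sketched as plain Python data structures. This is only an illustrative sketch: the names `SAR`, `Episode`, and `total_reward`, as well as the states, actions, and reward values, are assumptions made for the example and do not come from any library.

```python
from typing import List, NamedTuple


class SAR(NamedTuple):
    """One (state, action, reward) triple, written (s, a, r)."""
    state: int
    action: str
    reward: float


# An episode is one complete run of the task: an ordered list of SAR triples.
Episode = List[SAR]


def total_reward(episode: Episode) -> float:
    """Overall reward of an episode: the sum of every reward received."""
    return sum(step.reward for step in episode)


# Hypothetical three-step episode; all values are made up for illustration.
episode = [
    SAR(state=0, action="right", reward=0.0),
    SAR(state=1, action="right", reward=0.0),
    SAR(state=2, action="right", reward=1.0),
]

print(total_reward(episode))  # 1.0
```

Here the overall reward is an undiscounted sum, which is the simplest possible choice for the sketch.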
Let's deduce the convention shown in the following diagram:

Every task is a sequence of SAR triples. We start from state S(t), perform action A(t), receive a reward R(t+1), and land in a new state S(t+1). In other words, the current state and action pair determines the reward at the next step. Since S(t) and A(t) result in S(t+1), we also have a triple of (current state, action, new state), that is, [S(t), A(t), S(t+1)] or (s, a, s').
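The time-indexing convention can be made concrete with a small interaction loop. The following is a minimal sketch under stated assumptions: the four-cell corridor environment, the `step` function, and the always-move-right `policy` are hypothetical and chosen only to keep the example deterministic.

```python
from typing import List, Tuple

GOAL = 3  # hypothetical terminal state of a 4-cell corridor (states 0..3)


def step(state: int, action: int) -> Tuple[int, float]:
    """Toy environment. The action is -1 (left) or +1 (right).

    Given (S(t), A(t)), returns (S(t+1), R(t+1)).
    """
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def policy(state: int) -> int:
    """Placeholder agent that always moves right."""
    return +1


def run_episode(start: int = 0):
    """Run one episode and record both kinds of triples."""
    sar: List[Tuple[int, int, float]] = []  # (s, a, r)
    sas: List[Tuple[int, int, int]] = []    # (s, a, s')
    state = start
    while state != GOAL:
        action = policy(state)                    # A(t), chosen in S(t)
        next_state, reward = step(state, action)  # S(t+1), R(t+1)
        sar.append((state, action, reward))
        sas.append((state, action, next_state))
        state = next_state
    return sar, sas


sar_triples, transitions = run_episode()
print(sar_triples)  # [(0, 1, 0.0), (1, 1, 0.0), (2, 1, 1.0)]
print(transitions)  # [(0, 1, 1), (1, 1, 2), (2, 1, 3)]
```

Each pass through the loop consumes S(t) and A(t) and produces R(t+1) and S(t+1), so the reward stored in a triple is the one received at the following time step.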