- Reinforcement Learning with TensorFlow
- Sayon Dutta
- 2021-08-27 18:51:57
Basic terminologies and conventions
The following are the basic terminologies associated with reinforcement learning:
- Agent: The program we create so that it can sense the environment, perform actions, receive feedback, and try to maximize rewards.
- Environment: The world where the agent resides. It can be real or simulated.
- State: The perception or configuration of the environment that the agent senses. State spaces can be finite or infinite.
- Rewards: Feedback the agent receives after any action it has taken. The goal of the agent is to maximize the overall reward, that is, the immediate reward plus the future reward. Rewards are defined in advance, so they must be designed carefully for the agent to reach the goal efficiently.
- Actions: Anything that the agent is capable of doing in the given environment. The action space can be finite or infinite.
- SAR triple: (state, action, reward) is referred to as the SAR triple, represented as (s, a, r).
- Episode: Represents one complete run of the whole task.
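The terms above can be tied together in a minimal sketch. Everything named here (`CorridorEnv`, `random_agent`, `run_episode`, and the reward values) is hypothetical and chosen only for illustration; it is not taken from the book or from any library. The environment is a five-cell corridor, the state is the agent's position, the actions are a step left or right, and one episode runs until the agent reaches the last cell.

```python
import random


class CorridorEnv:
    """Hypothetical environment: positions 0..length-1, goal at the last one."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        # Every episode starts at the left end of the corridor.
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clipped to the corridor.
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        # Assumed reward design: small penalty per step, +1 on reaching the goal.
        reward = 1.0 if done else -0.1
        return self.state, reward, done


def random_agent(state, rng):
    # A deliberately naive agent: it ignores the state and acts at random.
    return rng.choice([-1, 1])


def run_episode(env, policy, rng, max_steps=100):
    """Run one episode and return its list of (s, a, r) triples."""
    state = env.reset()
    triples = []
    for _ in range(max_steps):
        action = policy(state, rng)
        next_state, reward, done = env.step(action)
        triples.append((state, action, reward))
        state = next_state
        if done:
            break
    return triples


rng = random.Random(0)
episode = run_episode(CorridorEnv(), random_agent, rng)
total_reward = sum(r for _, _, r in episode)
print(len(episode), "steps, total reward:", total_reward)
```

A learning agent would replace `random_agent` with a policy that uses the state and past rewards to choose actions that increase the total reward per episode.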
Let's deduce the convention shown in the following diagram:

Every task is a sequence of SAR triples. We start from state S(t), perform action A(t), receive a reward R(t+1), and land in a new state S(t+1). The current state-action pair therefore determines the reward for the next step. Since S(t) and A(t) result in S(t+1), we also have a new triple of (current state, action, new state), that is, [S(t), A(t), S(t+1)] or (s, a, s').
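As a small illustration of this convention, the sketch below turns a sequence of (s, a, r) triples into (s, a, s') triples by pairing each step with the state that follows it. The helper name `to_transitions` and the hand-written trajectory are assumptions made for the example, not part of the book's code.

```python
def to_transitions(triples, final_state):
    """Pair each (s, a, r) triple with the state reached after it.

    triples:     list of (S(t), A(t), R(t+1)) in time order
    final_state: the state reached after the last action
    """
    states = [s for s, _, _ in triples] + [final_state]
    return [(s, a, states[i + 1]) for i, (s, a, _) in enumerate(triples)]


# Hand-written trajectory with illustrative values only.
sar = [("s0", "right", 0.0), ("s1", "right", 0.0), ("s2", "right", 1.0)]
transitions = to_transitions(sar, final_state="s3")
print(transitions)
# [('s0', 'right', 's1'), ('s1', 'right', 's2'), ('s2', 'right', 's3')]
```

The final state has to be passed separately because the last SAR triple records only the state the action was taken from, not the state it led to.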