- Python Reinforcement Learning Projects
- Sean Saito Yang Wenzhuo Rajalingappaa Shanmugamani
- 2021-07-23 19:05:00
The agent
The goal of a reinforcement learning agent is to learn to perform a task well in an environment. Mathematically, this means maximizing the cumulative reward, R, which can be expressed by the following equation:
\[ R = \sum_{t=0}^{\infty} \gamma^{t} r_{t} \]
Here \(r_t\) is the reward received at timestep \(t\), so we are simply calculating a weighted sum of the reward received at each timestep. \(\gamma\) is called the discount factor, which is a scalar value between 0 and 1. The idea is that the later a reward comes, the less valuable it becomes. This reflects how we value rewards ourselves: we would rather receive $100 now than a year from now, which shows how the same reward signal can be valued differently based on its proximity to the present.
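To make the weighted sum concrete, here is a minimal sketch that computes the discounted cumulative reward for a finite list of rewards. The function name `discounted_return` and the convention that the first reward is weighted by the discount factor raised to the power 0 are assumptions made for illustration, not taken from the book's code.

```python
def discounted_return(rewards, gamma):
    """Return sum over t of gamma**t * rewards[t], with t starting at 0.

    `rewards` is a finite sequence of per-timestep rewards (an assumption:
    the episode ends after the last entry).
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie between 0 and 1")
    total = 0.0
    # Walk backwards so that each earlier step discounts everything after it once more.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Three rewards of 1 with gamma = 0.5: 1 + 0.5 + 0.25
print(discounted_return([1.0, 1.0, 1.0], 0.5))  # 1.75
```

With a discount factor of 1 every reward counts equally; as the factor shrinks toward 0, only the earliest rewards matter.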
Because the mechanics of the environment are not fully observable or known to the agent, it must gain information by performing an action and observing how the environment reacts to it. This is much like how humans learn to perform certain tasks as well.
Suppose we are learning to play chess. While we don't have all the possible moves committed to memory or know exactly how an opponent will play, we are able to improve our proficiency over time. In particular, we are able to become proficient in the following:
- Learning how to react to a move made by the opponent
- Assessing how strong a position we are in to win the game
- Predicting what the opponent will do next and using that prediction to decide on a move
- Understanding how others would play in a similar situation
In fact, reinforcement learning agents can learn to do similar things. In particular, an agent can be composed of multiple functions and models to assist its decision-making. There are three main components that an agent can have: the policy, the value function, and the model.
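As a rough illustration of how these three components might fit together, the sketch below uses a made-up corridor task with states 0 to 4, where reaching state 4 yields a reward of 1. The task, the function names `model`, `value`, and `policy`, and the hand-written value estimate are all assumptions for illustration; a real agent would learn these rather than have them written out.

```python
GOAL = 4  # hypothetical corridor with states 0..4; state 4 is terminal


def model(state, action):
    """Model: the agent's prediction of (next_state, reward) for an action.

    `action` is -1 (step left) or +1 (step right).
    """
    next_state = max(0, min(GOAL, state + action))
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def value(state, gamma=0.9):
    """Value function: estimated discounted reward obtainable from `state`."""
    if state >= GOAL:
        return 0.0  # nothing more to collect after the episode ends
    steps_before_final_move = GOAL - state - 1
    return gamma ** steps_before_final_move


def policy(state, gamma=0.9):
    """Policy: pick the action whose predicted outcome scores highest."""
    def score(action):
        next_state, reward = model(state, action)
        return reward + gamma * value(next_state, gamma)
    return max((-1, +1), key=score)


# Roll out the policy. For simplicity the model doubles as the environment here.
state, trajectory = 0, [0]
while state != GOAL:
    state, _ = model(state, policy(state))
    trajectory.append(state)
print(trajectory)  # [0, 1, 2, 3, 4]
```

In this toy setting the model is exact, which is rarely true in practice; the point is only to show the three roles side by side.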