- Hands-On Q-Learning with Python
- Nazia Habib
- 2021-06-24 15:13:11
Alpha – deterministic versus stochastic environments
Your agent's learning rate, alpha, ranges from zero to one. Setting the learning rate to zero causes your agent to learn nothing: its Q-values are never updated, so neither its exploration of the environment nor the rewards it receives affect its behavior. If it started out behaving randomly, it will continue to behave completely randomly.
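Setting the learning rate to one causes each new experience to completely overwrite what the agent previously estimated for that state and action. The policies it learns this way are fully specific to a deterministic environment, where the same experience repeats exactly every time. To see why, it helps to understand the distinction between deterministic and stochastic environments and policies.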
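To make the two extremes concrete, here is a minimal sketch of a single tabular Q-learning update. The function name `q_update`, the discount factor `gamma=0.9`, and the sample numbers are assumptions chosen for illustration, not code from the book.

```python
def q_update(q_old, reward, max_next_q, alpha, gamma=0.9):
    """Return the updated Q-value for one (state, action) pair.

    q_old      -- current estimate for this state and action
    reward     -- reward observed after taking the action
    max_next_q -- highest Q-value available from the next state
    alpha      -- learning rate, between 0 and 1
    gamma      -- discount factor (0.9 is an illustrative assumption)
    """
    td_target = reward + gamma * max_next_q
    # Move the old estimate toward the target by a fraction alpha.
    return q_old + alpha * (td_target - q_old)


# Illustrative values only.
q_old, reward, max_next_q = 2.0, 1.0, 5.0

frozen = q_update(q_old, reward, max_next_q, alpha=0.0)       # old estimate kept
overwritten = q_update(q_old, reward, max_next_q, alpha=1.0)  # old estimate discarded
blended = q_update(q_old, reward, max_next_q, alpha=0.5)      # halfway between the two
```

With `alpha=0.0` the result equals `q_old`, so nothing is learned. With `alpha=1.0` the result equals the target `reward + gamma * max_next_q`, so the previous estimate has no influence at all.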
Briefly, in a deterministic environment, the outcome is totally determined by the current state and the action taken, and there is no randomness involved: the same action from the same state always produces the same result. Likewise, under a deterministic policy, we always take the same action from the same state.
In a stochastic environment, there is randomness involved, so the same action from the same state can lead to different outcomes. Under a stochastic policy, the decisions that we make are given as probability distributions. In other words, we don't always take the same action from the same state.
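The difference can be sketched with two toy transition functions and a running estimate. Everything here is a hypothetical illustration: the one-dimensional state, the `slip_prob` of 0.2, and the helper names are assumptions, not part of the book's code.

```python
import random


def deterministic_step(state, action):
    """The same state and action always give the same next state."""
    return state + action


def stochastic_step(state, action, rng, slip_prob=0.2):
    """With probability slip_prob the move fails and the agent stays put."""
    if rng.random() < slip_prob:
        return state
    return state + action


def track_success(alpha, rng, steps=5000, slip_prob=0.2):
    """Estimate how often a move succeeds, updating with step size alpha."""
    estimate = 0.0
    for _ in range(steps):
        moved = stochastic_step(0, 1, rng, slip_prob) == 1
        reward = 1.0 if moved else 0.0
        estimate += alpha * (reward - estimate)
    return estimate


rng = random.Random(0)  # fixed seed so the run is repeatable

det_outcomes = {deterministic_step(3, 1) for _ in range(1000)}
sto_outcomes = {stochastic_step(3, 1, rng) for _ in range(1000)}

last_sample_only = track_success(alpha=1.0, rng=rng)  # always exactly 0.0 or 1.0
smoothed = track_success(alpha=0.01, rng=rng)         # settles near 1 - slip_prob
```

Here `det_outcomes` contains a single next state, while `sto_outcomes` contains more than one. In the stochastic case, a learning rate of one simply remembers the most recent outcome, whereas a small learning rate averages over many outcomes. This is why a learning rate of one suits only deterministic environments.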