- Hands-On Q-Learning with Python
- Nazia Habib
- 2021-06-24 15:13:11
Alpha – deterministic versus stochastic environments
Your agent's learning rate, alpha, ranges from zero to one. Setting it to zero causes the agent to learn nothing: its Q-values are never updated, so neither its exploration of the environment nor the rewards it receives affect its behavior. An agent that starts out behaving randomly will continue to behave randomly.
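For reference, the role of alpha is easiest to see in the standard tabular Q-learning update. The symbols below are assumptions brought in for illustration and are not defined in this section: r is the reward received, γ is the discount factor, and s' is the next state.

$$Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right]$$

With α = 0, the bracketed correction is multiplied by zero, so Q(s,a) is left exactly as it was.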
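Setting the learning rate to one causes each new experience to completely overwrite what the agent previously learned about a state-action pair. The resulting policy is fully specific to a deterministic environment, where a single observation of an outcome tells the agent everything about it. This makes the distinction between deterministic and stochastic environments and policies important to understand.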
Briefly, in a deterministic environment, the outcome is fully determined by the initial conditions and there is no randomness involved: taking the same action from the same state always leads to the same next state and reward. Likewise, a deterministic policy always takes the same action from the same state.
In a stochastic environment, there is randomness involved: the same action taken from the same state can lead to different outcomes. Similarly, a stochastic policy gives its decisions as probability distributions over actions. In other words, we don't always take the same action from the same state.
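The effect of alpha in the two kinds of environment can be sketched with a few lines of Python. This is a minimal illustration rather than code from the book: the helper names `q_update` and `run`, the discount factor of 0.9, and the coin-flip reward stream standing in for a stochastic environment are all assumptions made for this example.

```python
import random


def q_update(q_old, reward, best_next_q, alpha, gamma=0.9):
    """One tabular Q-learning update for a single (state, action) pair.

    gamma=0.9 is an illustrative default, not a value from the text.
    """
    td_target = reward + gamma * best_next_q
    return q_old + alpha * (td_target - q_old)


def run(alpha, rewards, q0=0.0):
    """Apply the update once per observed reward (hypothetical helper).

    best_next_q is fixed at 0.0, as if every step ended the episode,
    so the Q-value simply tracks the reward signal.
    """
    q = q0
    for r in rewards:
        q = q_update(q, r, best_next_q=0.0, alpha=alpha)
    return q


rng = random.Random(0)

# Deterministic environment: the same action always yields reward 1.
deterministic_rewards = [1.0] * 1000

# Stochastic environment (assumed): reward is 1 or 0 with equal probability.
noisy_rewards = [float(rng.random() < 0.5) for _ in range(1000)]

q_zero = run(0.0, noisy_rewards)          # alpha = 0: estimate never moves
q_one_det = run(1.0, deterministic_rewards)  # alpha = 1: one sample suffices
q_one_noisy = run(1.0, noisy_rewards)     # alpha = 1: echoes the last sample
q_small_noisy = run(0.05, noisy_rewards)  # small alpha: averages over samples

print(q_zero, q_one_det, q_one_noisy, q_small_noisy)
```

With alpha set to one, the estimate in the noisy case is simply whatever reward was seen last, which is why that setting only suits a deterministic environment. A smaller alpha blends each new sample into a running average instead.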