- TensorFlow Reinforcement Learning Quick Start Guide
- Kaushik Balakrishnan
- 2021-06-24 15:29:08
Model-free and model-based training
RL algorithms that do not learn a model of how the environment works are called model-free algorithms. By contrast, algorithms that construct a model of the environment are called model-based. In general, algorithms that rely only on value (V) or action-value (Q) functions to evaluate performance are model-free, because they use no explicit model of the environment. On the other hand, algorithms that build a model of how the environment transitions from one state to another, or of how much reward the agent will receive from the environment, are model-based.
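To make the distinction concrete, here is a minimal tabular sketch in plain Python. It is not taken from the book: the four-state chain environment, the `step` function, and the helper names are all hypothetical, chosen only for illustration. The model-free update changes Q(s, a) directly from each observed transition, while the model-based update records transition counts and rewards from which the dynamics could later be estimated.

```python
import random
from collections import defaultdict

N_STATES = 4          # states 0..3; state 3 is terminal (hypothetical toy chain)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning_update(q, s, a, r, s_next, done, alpha=0.5, gamma=0.9):
    """Model-free: adjust Q(s, a) directly from one observed transition."""
    target = r if done else r + gamma * max(q[(s_next, b)] for b in ACTIONS)
    q[(s, a)] += alpha * (target - q[(s, a)])


def model_update(counts, reward_sums, s, a, r, s_next):
    """Model-based: record the transition so dynamics and rewards can be estimated."""
    counts[(s, a)][s_next] += 1
    reward_sums[(s, a)] += r


def collect(num_episodes=200, seed=0):
    """Run random-exploration episodes, feeding both learners the same data."""
    rng = random.Random(seed)
    q = defaultdict(float)
    counts = defaultdict(lambda: defaultdict(int))
    reward_sums = defaultdict(float)
    for _ in range(num_episodes):
        s, done = 0, False
        while not done:
            a = rng.choice(ACTIONS)
            s_next, r, done = step(s, a)
            q_learning_update(q, s, a, r, s_next, done)
            model_update(counts, reward_sums, s, a, r, s_next)
            s = s_next
    return q, counts, reward_sums


q, counts, reward_sums = collect()
```

After the run, `q` holds action-value estimates that never reference the environment's dynamics, whereas `counts` and `reward_sums` together form a simple empirical model of those dynamics.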
In model-free algorithms, as noted above, we do not construct a model of the environment. The agent therefore has to actually take an action in a state to find out whether it is a good or a bad choice. In model-based RL, an approximate model of the environment is learned, either jointly with the policy or a priori. This model is then used to make decisions as well as to train the policy. We will learn more about both classes of RL algorithms in later chapters.
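As a rough illustration of using a model to make decisions, the sketch below plans with value iteration over a small, hand-written stand-in for a learned model. Everything here is an assumption made for the example: the deterministic `learned_model` table, the state and action sets, and the helper names are hypothetical. A real learned model would usually be approximate and stochastic.

```python
GAMMA = 0.9
ACTIONS = (0, 1)          # 0 = move left, 1 = move right
STATES = (0, 1, 2, 3)
TERMINAL = 3

# Hypothetical learned model: (state, action) -> (predicted next state, predicted reward).
learned_model = {
    (0, 0): (0, 0.0), (0, 1): (1, 0.0),
    (1, 0): (0, 0.0), (1, 1): (2, 0.0),
    (2, 0): (1, 0.0), (2, 1): (3, 1.0),
}


def backup(model, values, s, a):
    """One-step lookahead through the model instead of the real environment."""
    s_next, r = model[(s, a)]
    return r + GAMMA * values[s_next]


def plan(model, sweeps=50):
    """Value iteration carried out entirely on the model's predictions."""
    values = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        for s in STATES:
            if s != TERMINAL:
                values[s] = max(backup(model, values, s, a) for a in ACTIONS)
    return values


def greedy_action(model, values, s):
    """Pick the action whose predicted outcome looks best under the model."""
    return max(ACTIONS, key=lambda a: backup(model, values, s, a))


values = plan(learned_model)
policy = {s: greedy_action(learned_model, values, s) for s in STATES if s != TERMINAL}
```

No environment interaction happens during planning: the agent evaluates actions by querying the model, which is what allows a model-based method to judge a choice without first trying it.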