- Hands-On Q-Learning with Python
- Nazia Habib
SARSA versus Q-learning – on-policy or off?
Like Q-learning, SARSA is a model-free RL method: it learns action values directly from experience and does not explicitly learn the agent's policy function.
The primary difference between SARSA and Q-learning is that SARSA is an on-policy method while Q-learning is an off-policy method. The effective difference between the two algorithms happens in the step where the Q-table is updated. Let's discuss what that means with some examples:
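The distinction is easiest to see in the update rules themselves. Below is a minimal sketch (the table sizes and hyperparameters are illustrative assumptions, not from the book): Q-learning bootstraps from the greedy action in the next state, while SARSA bootstraps from the action the current policy actually took.

```python
import numpy as np

# Illustrative toy setup: 5 states, 2 actions, assumed hyperparameters.
n_states, n_actions = 5, 2
alpha, gamma = 0.1, 0.9

Q = np.zeros((n_states, n_actions))

def q_learning_update(Q, s, a, r, s_next):
    # Off-policy: the target uses the max over next actions,
    # regardless of which action the behavior policy will actually take.
    target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])

def sarsa_update(Q, s, a, r, s_next, a_next):
    # On-policy: the target uses a_next, the action the current
    # (e.g. epsilon-greedy) policy actually selected in s_next.
    target = r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (target - Q[s, a])
```

Because SARSA's target depends on the action the policy really chose, exploratory (non-greedy) moves feed back into its value estimates, whereas Q-learning always evaluates the greedy choice.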

Monte Carlo tree search (MCTS) is a type of model-based RL. We won't be discussing it in detail here, but it's useful to explore further as a contrast to model-free RL algorithms. Briefly, in model-based RL, we attempt to explicitly model the environment's transitions and rewards so that the agent can plan ahead, rather than relying as much on trial-and-error sampling in the learning process.