- TensorFlow Reinforcement Learning Quick Start Guide
- Kaushik Balakrishnan
- 2021-06-24 15:29:09
Understanding TD learning
We will first learn about TD (temporal difference) learning, a fundamental concept in RL. In TD learning, the agent learns from experience: it runs several trial episodes in the environment and uses the rewards it accrues to update its value functions. Specifically, the agent keeps updating its state-action value function as it experiences new states and actions. The Bellman equation is used to update this state-action value function, and the goal is to minimize the TD error. In effect, the agent reduces its uncertainty about which action is optimal in a given state; as the TD error shrinks, it gains confidence in that action.
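To make the update concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the book's own code: the five-state chain environment, the function names `td_update` and `run_td_learning`, the step size `alpha`, and the discount factor `gamma` are all hypothetical choices. The target uses the maximum over next-state actions, which is the Q-learning form of the Bellman update; other TD methods build the target differently.

```python
import random

def td_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.9):
    """Apply one TD update to q[state][action] and return the TD error."""
    # Bellman-style target: reward plus the discounted value of the best next action.
    # Using max() here is the Q-learning choice, an assumption of this sketch.
    best_next = 0.0 if done else max(q[next_state])
    td_error = reward + gamma * best_next - q[state][action]
    # Move the current estimate a fraction alpha of the way toward the target.
    q[state][action] += alpha * td_error
    return td_error

def run_td_learning(num_episodes=500, num_states=5, seed=0):
    """Learn Q-values on a hypothetical chain: start at state 0, goal at the right end."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(num_states)]  # actions: 0 = left, 1 = right
    goal = num_states - 1
    mean_abs_errors = []
    for _ in range(num_episodes):
        state, abs_errors = 0, []
        while state != goal:
            action = rng.randrange(2)  # uniformly random exploration
            next_state = max(state - 1, 0) if action == 0 else state + 1
            done = next_state == goal
            reward = 1.0 if done else 0.0  # reward only on reaching the goal
            abs_errors.append(abs(td_update(q, state, action, reward, next_state, done)))
            state = next_state
        mean_abs_errors.append(sum(abs_errors) / len(abs_errors))
    return q, mean_abs_errors

q_table, errors = run_td_learning()
print("mean |TD error|, first episode:", errors[0])
print("mean |TD error|, last episode:", errors[-1])
print("greedy action per non-terminal state:", [row.index(max(row)) for row in q_table[:-1]])
```

In this toy setting the mean absolute TD error per episode falls toward zero as episodes accumulate, and the greedy action in every non-terminal state becomes "right" (toward the goal). This mirrors the description above: lowering the TD error goes hand in hand with the agent settling on the best action in each state.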