- PyTorch 1.x Reinforcement Learning Cookbook
- Yuxi (Hayden) Liu
- 2021-06-24 12:34:43
There's more...
If we examine the reward/episode plot, we can see that training can stop early once the environment is solved, that is, once the average reward over 100 consecutive episodes is no less than 195. To do this, we just add the following lines of code at the end of each episode in the training loop:
>>> if episode >= 99 and sum(total_rewards[-100:]) >= 19500:
...     break
Re-run the training session. You should get something similar to the following, which stops after several hundred episodes:
Episode 1: 10.0
Episode 2: 27.0
Episode 3: 28.0
Episode 4: 15.0
Episode 5: 12.0
...
Episode 549: 200.0
Episode 550: 200.0
Episode 551: 200.0
Episode 552: 200.0
Episode 553: 200.0
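The early-stopping check can also be tried in isolation, without the environment or the network. The following is only a minimal sketch under stated assumptions: `is_solved`, `train_with_early_stop`, and `fake_episode` are hypothetical names introduced for illustration, and `fake_episode` is a stand-in that returns made-up rewards rather than running a real training episode. Note that `episode >= 99` with a zero-based episode counter is the same as requiring at least 100 recorded rewards.

```python
def is_solved(total_rewards, window=100, threshold=195.0):
    """Return True once the last `window` rewards average at least `threshold`."""
    if len(total_rewards) < window:
        # Not enough episodes yet (equivalent to `episode >= 99` failing).
        return False
    # Comparing sums avoids a division: sum >= 195 * 100 == 19500.
    return sum(total_rewards[-window:]) >= threshold * window


def train_with_early_stop(run_episode, n_episode):
    """Run up to `n_episode` episodes, stopping as soon as is_solved() holds."""
    total_rewards = []
    for episode in range(n_episode):
        total_rewards.append(run_episode(episode))
        if is_solved(total_rewards):
            break
    return total_rewards


def fake_episode(episode):
    """Hypothetical stand-in for one training episode: reward rises to the 200 cap."""
    return float(min(200, 10 + episode))


rewards = train_with_early_stop(fake_episode, n_episode=1000)
print(f"Stopped after {len(rewards)} episodes")
```

In a real training session, `run_episode` would be replaced by whatever code plays one episode and returns its total reward; the stopping logic itself stays the same.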