- Python Reinforcement Learning
- Sudharsan Ravichandiran Sean Saito Rajalingappaa Shanmugamani Yang Wenzhuo
- 2021-06-24 15:17:32
The policy function
In Chapter 1, Introduction to Reinforcement Learning, we learned about the policy function, which maps states to actions. It is denoted by π.
The policy function can be represented as π(s): S → A, indicating a mapping from states to actions. In other words, a policy function says which action to perform in each state. Our ultimate goal is to find the optimal policy, the one that specifies the correct action to perform in each state so that the reward is maximized.
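To make the idea of a mapping from states to actions concrete, here is a minimal sketch of a deterministic policy stored as a lookup table. The state names, action names, and the function name `pi` are hypothetical, chosen only for illustration; they do not come from the book's own code.

```python
# Hypothetical states and actions for a tiny illustrative environment.
STATES = ["A", "B", "C"]
ACTIONS = ["left", "right"]

# A deterministic policy as a lookup table: one action for every state.
# These particular choices are arbitrary, not an optimal policy.
policy = {"A": "right", "B": "right", "C": "left"}


def pi(state):
    """Return the action the policy prescribes for the given state."""
    return policy[state]


# Show what the policy says to do in each state.
for s in STATES:
    print(s, "->", pi(s))
```

Finding the optimal policy then amounts to choosing the entries of this table so that the reward is maximized.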