- Python Reinforcement Learning
- Sudharsan Ravichandiran, Sean Saito, Rajalingappaa Shanmugamani, Yang Wenzhuo
- 2021-06-24 15:17:21
RL algorithm
The steps involved in a typical RL algorithm are as follows:
- First, the agent interacts with the environment by performing an action
- As a result of that action, the agent moves from one state to another
- The agent then receives a reward based on the action it performed
- Based on the reward, the agent learns whether the action was good or bad
- If the action was good, that is, if the agent received a positive reward, the agent will prefer performing that action; otherwise, it will try another action that may result in a positive reward. It is essentially a trial-and-error learning process
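The steps above can be sketched as a short Python loop. This is only an illustrative sketch, not the book's implementation: the `Corridor` environment, its reward values, the `train` function, and the tabular Q-learning update with epsilon-greedy exploration are all assumptions chosen to make the act, observe, reward, adjust cycle concrete.

```python
import random


class Corridor:
    """Hypothetical toy environment: states 0..4, with the goal at state 4."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 0 moves left, action 1 moves right; walls clamp the position.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        # Assumed reward scheme: +1 at the goal, a small penalty otherwise.
        reward = 1.0 if done else -0.1
        return self.state, reward, done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    env = Corridor()
    # q[state][action] is the agent's current estimate of how good an action is.
    q = [[0.0, 0.0] for _ in range(env.length)]
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Trial: occasionally try a random action, otherwise prefer
            # the action that has looked best so far.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            # The agent acts, moves to a new state, and receives a reward.
            next_state, reward, done = env.step(action)
            # Error correction: nudge the estimate toward the observed outcome.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy action per state after training (1 means "move right").
policy = [0 if left > right else 1 for left, right in q_table]
print(policy)
```

Actions that led to positive reward end up with higher estimates and are chosen more often, while poorly rewarded actions are gradually abandoned, which is the trial-and-error behaviour described in the list.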