PyTorch 1.x Reinforcement Learning Cookbook
Reinforcement learning (RL) is a branch of machine learning that has gained popularity in recent times. It allows you to train AI models that learn from their own actions and optimize their behavior. PyTorch has also emerged as the preferred tool for training RL models because of its efficiency and ease of use. With this book, you'll explore the important RL concepts and the implementation of algorithms in PyTorch 1.x. The recipes in the book, along with real-world examples, will help you master various RL techniques, such as dynamic programming, Monte Carlo simulations, temporal difference, and Q-learning. You'll also gain insights into industry-specific applications of these techniques. Later chapters will guide you through solving problems such as the multi-armed bandit problem and the cartpole problem using the multi-armed bandit algorithm and function approximation. You'll also learn how to use Deep Q-Networks to complete Atari games, along with how to effectively implement policy gradients. Finally, you'll discover how RL techniques are applied to Blackjack, Gridworld environments, internet advertising, and the Flappy Bird game. By the end of this book, you'll have developed the skills you need to implement popular RL algorithms and use RL techniques to solve real-world problems.
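The blurb above mentions the multi-armed bandit problem among the techniques covered. As a flavor of that material, here is a minimal, self-contained sketch of an epsilon-greedy bandit agent in plain Python; it is an illustrative example, not code from the book (the function name, parameters, and reward model are all chosen for this sketch).

```python
import random

def epsilon_greedy_bandit(true_means, n_steps=10000, epsilon=0.1, seed=0):
    """Estimate arm values of a multi-armed bandit with an epsilon-greedy policy."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms          # number of pulls per arm
    estimates = [0.0] * n_arms     # running mean reward per arm
    for _ in range(n_steps):
        if rng.random() < epsilon:                       # explore a random arm
            arm = rng.randrange(n_arms)
        else:                                            # exploit the best estimate
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)         # noisy reward draw
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # incremental mean
    return estimates, counts

estimates, counts = epsilon_greedy_bandit([0.1, 0.5, 0.9])
```

After enough steps, the agent pulls the highest-mean arm (index 2 here) most often, while the epsilon fraction of random pulls keeps the estimates for the other arms from going stale.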
Table of Contents (273 sections)
- Cover Page
- Title Page
- Copyright and Credits
- PyTorch 1.x Reinforcement Learning Cookbook
- About Packt
- Why subscribe?
- Contributors
- About the author
- About the reviewers
- Packt is searching for authors like you
- Preface
- Who this book is for
- What this book covers
- To get the most out of this book
- Download the example code files
- Download the color images
- Conventions used
- Sections
- Getting ready
- How to do it…
- How it works…
- There's more…
- See also
- Get in touch
- Reviews
- Getting Started with Reinforcement Learning and PyTorch
- Setting up the working environment
- How to do it...
- How it works...
- There's more...
- See also
- Installing OpenAI Gym
- How to do it...
- How it works...
- There's more...
- See also
- Simulating Atari environments
- How to do it...
- How it works...
- There's more...
- See also
- Simulating the CartPole environment
- How to do it...
- How it works...
- There's more...
- Reviewing the fundamentals of PyTorch
- How to do it...
- There's more...
- See also
- Implementing and evaluating a random search policy
- How to do it...
- How it works...
- There's more...
- Developing the hill-climbing algorithm
- How to do it...
- How it works...
- There's more...
- See also
- Developing a policy gradient algorithm
- How to do it...
- How it works...
- There's more...
- See also
- Markov Decision Processes and Dynamic Programming
- Technical requirements
- Creating a Markov chain
- How to do it...
- How it works...
- There's more...
- See also
- Creating an MDP
- How to do it...
- How it works...
- There's more...
- See also
- Performing policy evaluation
- How to do it...
- How it works...
- There's more...
- Simulating the FrozenLake environment
- Getting ready
- How to do it...
- How it works...
- There's more...
- Solving an MDP with a value iteration algorithm
- How to do it...
- How it works...
- There's more...
- Solving an MDP with a policy iteration algorithm
- How to do it...
- How it works...
- There's more...
- See also
- Solving the coin-flipping gamble problem
- How to do it...
- How it works...
- There's more...
- Monte Carlo Methods for Making Numerical Estimations
- Calculating Pi using the Monte Carlo method
- How to do it...
- How it works...
- There's more...
- See also
- Performing Monte Carlo policy evaluation
- How to do it...
- How it works...
- There's more...
- Playing Blackjack with Monte Carlo prediction
- How to do it...
- How it works...
- There's more...
- See also
- Performing on-policy Monte Carlo control
- How to do it...
- How it works...
- There's more...
- Developing MC control with epsilon-greedy policy
- How to do it...
- How it works...
- Performing off-policy Monte Carlo control
- How to do it...
- How it works...
- There's more...
- See also
- Developing MC control with weighted importance sampling
- How to do it...
- How it works...
- There's more...
- See also
- Temporal Difference and Q-Learning
- Setting up the Cliff Walking environment playground
- Getting ready
- How to do it...
- How it works...
- Developing the Q-learning algorithm
- How to do it...
- How it works...
- There's more...
- Setting up the Windy Gridworld environment playground
- How to do it...
- How it works...
- Developing the SARSA algorithm
- How to do it...
- How it works...
- There's more...
- Solving the Taxi problem with Q-learning
- Getting ready
- How to do it...
- How it works...
- Solving the Taxi problem with SARSA
- How to do it...
- How it works...
- There's more...
- Developing the Double Q-learning algorithm
- How to do it...
- How it works...
- See also
- Solving Multi-armed Bandit Problems
- Creating a multi-armed bandit environment
- How to do it...
- How it works...
- Solving multi-armed bandit problems with the epsilon-greedy policy
- How to do it...
- How it works...
- There's more...
- Solving multi-armed bandit problems with the softmax exploration
- How to do it...
- How it works...
- Solving multi-armed bandit problems with the upper confidence bound algorithm
- How to do it...
- How it works...
- There's more...
- See also
- Solving internet advertising problems with a multi-armed bandit
- How to do it...
- How it works...
- Solving multi-armed bandit problems with the Thompson sampling algorithm
- How to do it...
- How it works...
- See also
- Solving internet advertising problems with contextual bandits
- How to do it...
- How it works...
- Scaling Up Learning with Function Approximation
- Setting up the Mountain Car environment playground
- Getting ready
- How to do it...
- How it works...
- Estimating Q-functions with gradient descent approximation
- How to do it...
- How it works...
- See also
- Developing Q-learning with linear function approximation
- How to do it...
- How it works...
- Developing SARSA with linear function approximation
- How to do it...
- How it works...
- Incorporating batching using experience replay
- How to do it...
- How it works...
- Developing Q-learning with neural network function approximation
- How to do it...
- How it works...
- See also
- Solving the CartPole problem with function approximation
- How to do it...
- How it works...
- Deep Q-Networks in Action
- Developing deep Q-networks
- How to do it...
- How it works...
- See also
- Improving DQNs with experience replay
- How to do it...
- How it works...
- Developing double deep Q-Networks
- How to do it...
- How it works...
- Tuning double DQN hyperparameters for CartPole
- How to do it...
- How it works...
- Developing Dueling deep Q-Networks
- How to do it...
- How it works...
- Applying Deep Q-Networks to Atari games
- How to do it...
- How it works...
- Using convolutional neural networks for Atari games
- How to do it...
- How it works...
- See also
- Implementing Policy Gradients and Policy Optimization
- Implementing the REINFORCE algorithm
- How to do it...
- How it works...
- See also
- Developing the REINFORCE algorithm with baseline
- How to do it...
- How it works...
- Implementing the actor-critic algorithm
- How to do it...
- How it works...
- Solving Cliff Walking with the actor-critic algorithm
- How to do it...
- How it works...
- Setting up the continuous Mountain Car environment
- How to do it...
- How it works...
- Solving the continuous Mountain Car environment with the advantage actor-critic network
- How to do it...
- How it works...
- There's more...
- See also
- Playing CartPole through the cross-entropy method
- How to do it...
- How it works...
- Capstone Project – Playing Flappy Bird with DQN
- Setting up the game environment
- Getting ready
- How to do it...
- How it works...
- Building a Deep Q-Network to play Flappy Bird
- How to do it...
- How it works...
- Training and tuning the network
- How to do it...
- How it works...
- Deploying the model and playing the game
- How to do it...
- How it works...
- Other Books You May Enjoy
- Leave a review - let other readers know what you think

Updated: 2021-06-24 12:35:24