- Hands-On Q-Learning with Python
- Nazia Habib
States
Whatever we need to know about our environment is stored as part of our state, which can be represented as a vector of the variables that we care about, as sketched in the code example after this list:
- The location (x and y coordinates)
- The direction
- The color of light (red or green)
- The other cars present (for example, one binary flag for each spot a car might be in)
- The distance from the destination
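For a driving example like this one, a minimal sketch of the state vector might look like the following; the field names and sample values are illustrative assumptions, not code from the book:

```python
from collections import namedtuple

# Hypothetical state vector for the driving example above.
# Field names and example values are illustrative only.
CarState = namedtuple(
    "CarState",
    ["x", "y", "direction", "light_is_green",
     "occupied_spots", "distance_to_destination"],
)

state = CarState(
    x=3,                          # grid x coordinate
    y=7,                          # grid y coordinate
    direction="north",            # which way the car is facing
    light_is_green=False,         # color of the light (red or green)
    occupied_spots=(1, 0, 0, 1),  # one binary flag per spot a car might occupy
    distance_to_destination=12,   # remaining distance in grid cells
)

print(state)
```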
The following screenshot is from the game Pac-Man:

Taking Pac-Man as another example, we can use a state vector to represent the variables that we want to keep track of, such as the location of the dots left in the maze, where the Pac-Man character currently is and which direction it is moving in, the location and direction of each ghost, and whether or not each ghost can currently be eaten.
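A similar sketch for Pac-Man might look as follows; the class and field names are illustrative assumptions rather than the game's actual internals:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class GhostState:
    position: Tuple[int, int]   # (x, y) maze cell
    direction: str              # "up", "down", "left", or "right"
    edible: bool                # True while a power pellet is active

@dataclass
class PacManState:
    pacman_position: Tuple[int, int]
    pacman_direction: str
    remaining_dots: List[Tuple[int, int]]  # cells that still contain a dot
    ghosts: List[GhostState]

state = PacManState(
    pacman_position=(5, 9),
    pacman_direction="left",
    remaining_dots=[(1, 1), (1, 2), (2, 2)],
    ghosts=[GhostState(position=(8, 4), direction="down", edible=False)],
)
```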
We can represent any variables in our state vector that we think are important to our knowledge of the game. At any point in time, our state vector should capture everything we want to know about our environment.
Ideally, we should be able to look at our state vector and have all the information we need to determine the optimal action to take. A well-designed state space is key to an effective RL solution.
However, we can quickly see that the number of states in an environment depends on which variables we choose to keep track of; in other words, the state space is, to some extent, arbitrary. Not all algorithm designers will represent the same environment using the same state space, and as developers and researchers we notice that even a small change in how an environment's state space is represented can make a huge difference to the difficulty of the problem.
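To make this concrete, here is a rough back-of-the-envelope count of states for the driving example above; the grid size and variable choices are assumptions made purely for illustration:

```python
# Rough count of states for the driving example, assuming a 5 x 5 grid.
# Every value here is an illustrative assumption.
locations = 5 * 5          # x and y coordinates
directions = 4             # north, south, east, west
light_colors = 2           # red or green
other_car_flags = 2 ** 4   # one binary flag for each of 4 nearby spots

total_states = locations * directions * light_colors * other_car_flags
print(total_states)        # 25 * 4 * 2 * 16 = 3200

# Tracking one more binary variable doubles the count; dropping one halves it,
# which is why small changes to the state representation matter so much.
```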
When we use a standardized packaged environment such as the ones we'll be working with in OpenAI Gym, the state space (also called an observation space) will be determined for us. We'll also have a predetermined action space and reward structure.
One good reason to use a standardized environment such as the one offered by OpenAI Gym is that it allows you to compare the performance of your RL algorithms to the work of others. Having a level playing field for the state space allows us to meaningfully compare RL algorithms to each other in a way we otherwise could not.
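As a minimal sketch of inspecting those predetermined spaces, assuming gym is installed and using Taxi-v3 (one of the standardized Gym environments) as the example:

```python
import gym

# Inspect a standardized environment's predefined spaces.
env = gym.make("Taxi-v3")

print(env.observation_space)  # Discrete(500): the state (observation) space
print(env.action_space)       # Discrete(6): the action space

# Older Gym versions return just the initial state from reset();
# newer versions return a (state, info) tuple.
state = env.reset()
print(state)
```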