- PyTorch 1.x Reinforcement Learning Cookbook
- Yuxi (Hayden) Liu
- 2021-06-24 12:34:46
Simulating the FrozenLake environment
The optimal policies for the MDPs we have dealt with so far are pretty intuitive. However, it won't be that straightforward in most cases, such as the FrozenLake environment. In this recipe, let's play around with the FrozenLake environment and get ready for upcoming recipes where we will find its optimal policy.
FrozenLake is a typical Gym environment with a discrete state space. It is about moving an agent from the starting location to the goal location in a grid world, and at the same time avoiding traps. The grid is either four by four (https://gym.openai.com/envs/FrozenLake-v0/) or eight by eight (https://gym.openai.com/envs/FrozenLake8x8-v0/). The grid is made up of the following four types of tiles:
- S: The starting location
- G: The goal location, which terminates an episode
- F: The frozen tile, which is a walkable location
- H: The hole location, which terminates an episode
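To make the tile types concrete, here is a minimal sketch of how a four-by-four grid could be encoded in plain Python. The layout in GRID is assumed to be the default four-by-four map shipped with Gym (it is not given in the text), and the helpers tile_at and is_terminal are hypothetical names used only for illustration; states are assumed to be numbered row by row from the top-left corner.

```python
# Assumed default 4x4 layout (S = start, F = frozen, H = hole, G = goal).
GRID = ["SFFF",
        "FHFH",
        "FFFH",
        "HFFG"]
N_COLS = len(GRID[0])
N_STATES = len(GRID) * N_COLS


def tile_at(state):
    """Return the tile letter for a state index, counting row by row."""
    row, col = divmod(state, N_COLS)
    return GRID[row][col]


def is_terminal(state):
    """Both the goal (G) and holes (H) terminate an episode."""
    return tile_at(state) in "GH"


# Collect the indices of all hole tiles and of the goal tile.
holes = [s for s in range(N_STATES) if tile_at(s) == "H"]
goal = [s for s in range(N_STATES) if tile_at(s) == "G"]
print("holes:", holes, "goal:", goal)
```

With this encoding, every terminal state can be looked up from its index alone, which is convenient once states are treated as plain integers.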
There are four actions: moving left (0), moving down (1), moving right (2), and moving up (3). The reward is +1 if the agent successfully reaches the goal location, and 0 otherwise. The observation is a single integer identifying the tile the agent currently occupies, so the four-by-four grid has a discrete observation space of 16 states (0 to 15).
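The action encoding and reward rule can be sketched as a small step function. This is a standalone illustration that ignores any randomness in movement, not Gym's implementation: GRID is assumed to be the default four-by-four layout, step_deterministic is a hypothetical helper, and moves that would leave the grid are assumed to keep the agent in place.

```python
# Assumed default 4x4 layout; states are numbered row by row from 0 to 15.
GRID = ["SFFF", "FHFH", "FFFH", "HFFG"]
SIZE = 4

# Action codes as listed in the text.
LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3
# (row change, column change) for each action.
MOVES = {LEFT: (0, -1), DOWN: (1, 0), RIGHT: (0, 1), UP: (-1, 0)}


def step_deterministic(state, action):
    """Apply an action with no slipping; return (next_state, reward, done)."""
    row, col = divmod(state, SIZE)
    d_row, d_col = MOVES[action]
    # Clamp to the grid so that walking into a wall leaves the agent in place.
    row = min(max(row + d_row, 0), SIZE - 1)
    col = min(max(col + d_col, 0), SIZE - 1)
    next_state = row * SIZE + col
    tile = GRID[row][col]
    reward = 1.0 if tile == "G" else 0.0  # +1 only when the goal is reached
    done = tile in "GH"                   # goal and holes end the episode
    return next_state, reward, done


print(step_deterministic(0, RIGHT))
print(step_deterministic(14, RIGHT))
```

Stepping right from state 14 lands on the goal tile, which is the only transition in this sketch that yields a non-zero reward.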
What is tricky in this environment is that, as the ice surface is slippery, the agent won't always move in the direction it intends. For example, it may move to the left or to the right when it intends to move down.
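The slipping behavior can be sketched in a few lines. The sketch assumes the rule used by Gym's slippery FrozenLake, where the agent moves in the intended direction or in one of the two perpendicular directions, each with probability 1/3; that rule is not stated in the text, and possible_directions and slip are hypothetical helper names.

```python
import random
from collections import Counter

# Action codes as listed in the text.
LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3
NAMES = {LEFT: "left", DOWN: "down", RIGHT: "right", UP: "up"}


def possible_directions(action):
    """Intended direction plus its two perpendicular neighbours.

    With the 0-3 encoding above, the neighbours of an action in the
    cycle left -> down -> right -> up are exactly the perpendicular moves.
    """
    return [(action - 1) % 4, action, (action + 1) % 4]


def slip(action, rng):
    """Pick the direction actually taken, uniformly among the three options."""
    return rng.choice(possible_directions(action))


# Seeded generator so the experiment is repeatable.
rng = random.Random(0)
counts = Counter(slip(DOWN, rng) for _ in range(3000))
for direction, count in sorted(counts.items()):
    print(NAMES[direction], count)
```

Intending to move down therefore only ever results in a move to the left, down, or to the right, never up, and each outcome should appear in roughly a third of the trials.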