Book title: TensorFlow Reinforcement Learning Quick Start Guide
Author: Kaushik Balakrishnan
Last updated: 2021-06-24 15:29:10

Cliff walking and grid world problems

Let's consider the cliff walking and grid world problems. We will first introduce the problems and then move on to the code. Both use a rectangular grid of nrows (number of rows) by ncols (number of columns) cells. The agent starts one cell to the south of the bottom-left cell, and the goal is to reach the destination, which is one cell to the south of the bottom-right cell.

Note that the start and destination cells are not part of the nrows x ncols grid of cells. In the cliff walking problem, the cells to the south of the bottom row, other than the start and destination cells, form a cliff: if the agent enters any of them, the episode ends with a catastrophic fall. If the agent instead tries to leave the grid through its left, top, or right boundary, it is placed back in the same cell, which is equivalent to taking no action.
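The cliff walking rules above can be sketched as a single transition function. This is a minimal illustration, not the book's implementation: the function name `cliff_step`, the `ACTIONS` table, the status strings, and the row-indexing convention (row 0 at the top, row `nrows` as the extra southern row) are all assumptions made for this example. The text does not say what happens if the agent tries to move south of the extra row, so the sketch assumes that move also leaves the agent in place.

```python
# Hypothetical action table: (row_delta, col_delta); row index grows toward the south.
ACTIONS = {"north": (-1, 0), "east": (0, 1), "south": (1, 0), "west": (0, -1)}


def cliff_step(cell, action, nrows, ncols):
    """Return (next_cell, status) for one move in the cliff walking layout.

    Rows 0..nrows-1 are the grid itself. Row nrows is the extra southern row
    that holds the start cell, the cliff cells, and the destination cell.
    status is one of "running", "fell", or "goal".
    """
    start, goal = (nrows, 0), (nrows, ncols - 1)
    dr, dc = ACTIONS[action]
    r, c = cell[0] + dr, cell[1] + dc

    # Leaving through the left, top, or right boundary: the agent stays put.
    # Assumption: stepping south of the extra row is handled the same way.
    if r < 0 or r > nrows or c < 0 or c >= ncols:
        return cell, "running"

    nxt = (r, c)
    if nxt == goal:
        return nxt, "goal"
    if r == nrows and nxt != start:
        # Any other cell in the southern row is part of the cliff.
        return nxt, "fell"
    return nxt, "running"
```

For example, on a 4 x 12 grid the start cell is `(4, 0)`: moving north from it enters the grid at `(3, 0)`, while moving east steps straight onto the cliff. Rewards are deliberately left out, since the text above does not specify them.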

For the grid world problem, there is no cliff, but there are obstacles inside the grid. If the agent tries to enter an obstacle cell, it is bounced back to the cell it came from. In both problems, the goal is to find the optimal path from the start to the destination.
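The grid world variant can be sketched in the same way. Again, this is only an illustration under stated assumptions: `grid_world_step`, `MOVES`, and the `obstacles` set of `(row, col)` tuples are hypothetical names, and the sketch assumes that any move off the grid (other than into the start or destination cell) bounces the agent back, just as an obstacle does.

```python
# Hypothetical move table: (row_delta, col_delta); row index grows toward the south.
MOVES = {"north": (-1, 0), "east": (0, 1), "south": (1, 0), "west": (0, -1)}


def grid_world_step(cell, action, nrows, ncols, obstacles):
    """Return (next_cell, done) for one move in the grid world layout.

    obstacles is a set of (row, col) cells inside the grid that cannot be
    entered. The start and destination sit in the extra row below the grid.
    """
    start, goal = (nrows, 0), (nrows, ncols - 1)
    dr, dc = MOVES[action]
    nxt = (cell[0] + dr, cell[1] + dc)

    inside_grid = 0 <= nxt[0] < nrows and 0 <= nxt[1] < ncols
    reachable = inside_grid or nxt in (start, goal)

    # Obstacle cells bounce the agent back to the cell it came from.
    # Assumption: moves that leave the layout are treated the same way.
    if nxt in obstacles or not reachable:
        return cell, False

    return nxt, nxt == goal
```

With `obstacles = {(2, 1)}` on a 4 x 12 grid, a move north from `(3, 1)` leaves the agent at `(3, 1)`, whereas a move south from `(3, 11)` reaches the destination and ends the episode.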

So, let's dive in!
