Your Q-learning agent in its environment

Let's talk about the self-driving taxi agent that we'll be building. Recall that the Taxi-v2 environment has 500 states and 6 possible actions that can be taken from each state.
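To see where the 500 states come from, here is a minimal sketch of how a Taxi state can be packed into a single integer. It assumes the layout used by the Gym Taxi environment: a 5x5 grid, 5 passenger locations (four marked spots plus "in the taxi"), and 4 destinations. The function names `encode` and `decode` and the action labels are chosen for illustration here; they are not taken from the text above.

```python
# Assumed layout: 5x5 grid, 5 passenger locations, 4 destinations.
GRID_ROWS, GRID_COLS = 5, 5
PASSENGER_LOCATIONS = 5   # four marked spots + "inside the taxi"
DESTINATIONS = 4

NUM_STATES = GRID_ROWS * GRID_COLS * PASSENGER_LOCATIONS * DESTINATIONS

# Illustrative labels for the 6 discrete actions (indices 0 to 5).
ACTIONS = ["south", "north", "east", "west", "pickup", "dropoff"]


def encode(taxi_row, taxi_col, passenger_loc, destination):
    """Pack the four state components into one integer in [0, NUM_STATES)."""
    state = taxi_row
    state = state * GRID_COLS + taxi_col
    state = state * PASSENGER_LOCATIONS + passenger_loc
    state = state * DESTINATIONS + destination
    return state


def decode(state):
    """Invert encode(): recover (taxi_row, taxi_col, passenger_loc, destination)."""
    destination = state % DESTINATIONS
    state //= DESTINATIONS
    passenger_loc = state % PASSENGER_LOCATIONS
    state //= PASSENGER_LOCATIONS
    taxi_col = state % GRID_COLS
    taxi_row = state // GRID_COLS
    return taxi_row, taxi_col, passenger_loc, destination


if __name__ == "__main__":
    print(NUM_STATES, len(ACTIONS))
    print(decode(encode(3, 1, 2, 0)))
```

Because every state maps to one integer, a Q-table for this problem can be as simple as a 500-row, 6-column array indexed by `[state][action]`.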

Your objective in the taxi environment is to pick up a passenger at one location, and drop them off at their desired destination in as few timesteps as possible.

You receive points for a successful drop-off and lose points for every timestep the task takes. You also lose points for incorrect actions, such as dropping a passenger off at the wrong location.

Because your goal is to reach both the pickup and drop-off locations as quickly as possible, every timestep you spend costs you one point.
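The reward scheme described here can be sketched as a small scoring function. The specific values below (+20 for a successful drop-off, -10 for an illegal pickup or drop-off, -1 for any other step) are assumptions based on the commonly documented Gym Taxi rewards, not figures stated in this section, and `episode_return` is a hypothetical helper, not part of Gym.

```python
# Assumed reward values (based on the Gym Taxi environment's documented rewards).
STEP_REWARD = -1       # any ordinary move
ILLEGAL_REWARD = -10   # illegal pickup or drop-off
DROPOFF_REWARD = 20    # successful drop-off, which ends the episode


def episode_return(total_steps, illegal_actions=0):
    """Undiscounted return for an episode that ends with a successful drop-off.

    total_steps counts every action taken, including illegal ones and the
    final drop-off. Each step earns exactly one of the three rewards above.
    """
    if total_steps < 1 or illegal_actions < 0 or illegal_actions > total_steps - 1:
        raise ValueError("inconsistent step counts")
    ordinary_steps = total_steps - 1 - illegal_actions
    return (DROPOFF_REWARD
            + illegal_actions * ILLEGAL_REWARD
            + ordinary_steps * STEP_REWARD)


if __name__ == "__main__":
    print(episode_return(12))                      # efficient episode
    print(episode_return(12, illegal_actions=1))   # one wrong drop-off attempt
```

Under these assumed values, a single illegal action costs as much as ten ordinary moves, which is why the agent is pushed to learn both short routes and correct pickup and drop-off behavior.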

Your agent's goal in solving this problem is to find the optimal policy for getting the passenger to their destination as efficiently as possible, netting the maximum reward for itself. While it navigates the environment, it will learn the best action to take from each state, which will serve as its policy function.

Remember that Q-learning is a value-based, off-policy method: its updates do not take into account the policy your agent is actually following, and we will not explicitly enumerate a policy. Instead, the Q-learning algorithm calculates the value of each state-action pair from the highest value available among the actions your agent could take next, which amounts to assuming that your agent follows the optimal policy from that point on.
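As a rough illustration of that idea, the following sketch applies one tabular Q-learning update. The names `q_update`, `alpha` (learning rate), and `gamma` (discount factor), along with their default values, are assumptions made for this example; the functions you write later may be organized differently.

```python
NUM_STATES, NUM_ACTIONS = 500, 6


def make_q_table():
    """Create a Q-table of zeros: one row per state, one column per action."""
    return [[0.0] * NUM_ACTIONS for _ in range(NUM_STATES)]


def q_update(q_table, state, action, reward, next_state,
             alpha=0.1, gamma=0.9, done=False):
    """Apply one Q-learning update in place and return the new Q-value.

    The target uses the maximum Q-value in next_state, regardless of which
    action the agent will actually choose there. A terminal transition
    (done=True) has no future value.
    """
    best_next = 0.0 if done else max(q_table[next_state])
    td_target = reward + gamma * best_next
    td_error = td_target - q_table[state][action]
    q_table[state][action] += alpha * td_error
    return q_table[state][action]


if __name__ == "__main__":
    q = make_q_table()
    q[1][2] = 10.0  # pretend action 2 in state 1 is already known to be good
    print(q_update(q, state=0, action=3, reward=-1, next_state=1, alpha=0.5))
```

Note the `max` over the next state's row: it is this step that makes the update assume optimal behavior from the next state onward, even while the agent is still exploring.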

We will continue to explore this concept in more detail through the functions that you will write for your agent. The OpenAI Gym package provides the game environment, and you will implement the Q-learning algorithm yourself. You can then use the same environment to implement other RL algorithms and compare their performance.