Your Q-learning agent in its environment

Let's talk about the self-driving taxi agent that we'll be building. Recall that the Taxi-v2 environment has 500 states and 6 possible actions that can be taken from each state.
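To see where the figure of 500 comes from, here is a minimal sketch of the state arithmetic. It assumes the usual Taxi layout (a 5x5 grid, four landmark locations, and a passenger who is either at a landmark or in the taxi) and the row-major encoding that Gym's Taxi environment is generally understood to use. The names `encode_state` and `decode_state` are illustrative helpers of our own, not part of the Gym API.

```python
# Assumed layout: 5x5 grid, 4 landmarks, passenger at a landmark or in the taxi.
GRID_ROWS, GRID_COLS = 5, 5
PASSENGER_LOCATIONS = 5   # four landmarks plus "in the taxi"
DESTINATIONS = 4          # one of the four landmarks

N_STATES = GRID_ROWS * GRID_COLS * PASSENGER_LOCATIONS * DESTINATIONS
ACTIONS = ["south", "north", "east", "west", "pickup", "dropoff"]


def encode_state(row, col, passenger, destination):
    """Pack the four state components into a single integer index."""
    index = row * GRID_COLS + col
    index = index * PASSENGER_LOCATIONS + passenger
    index = index * DESTINATIONS + destination
    return index


def decode_state(index):
    """Invert encode_state, returning (row, col, passenger, destination)."""
    index, destination = divmod(index, DESTINATIONS)
    index, passenger = divmod(index, PASSENGER_LOCATIONS)
    row, col = divmod(index, GRID_COLS)
    return row, col, passenger, destination


print(N_STATES, len(ACTIONS))
```

Because every state maps to one integer, the agent's Q-table can be a simple 500-by-6 grid of numbers, one row per state and one column per action.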

Your objective in the taxi environment is to pick up a passenger at one location, and drop them off at their desired destination in as few timesteps as possible.

You receive points for a successful drop-off and lose points for incorrect actions, such as dropping a passenger off at the wrong location.

You also lose one point for every timestep you take, whether or not the move brings you closer to the pickup or drop-off location. This is what pushes your agent to reach both locations as quickly as possible.
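As a rough illustration of how these rewards add up over an episode, consider the sketch below. The specific values (-1 per timestep, -10 for an illegal pickup or drop-off, +20 for a successful drop-off) are assumptions on our part; they match what Gym's Taxi environment is commonly documented to use, but the text above does not state them. The helper `episode_return` is hypothetical.

```python
# Assumed reward values; check them against the environment you are using.
STEP_PENALTY = -1             # every ordinary timestep, including a legal pickup
ILLEGAL_ACTION_PENALTY = -10  # pickup or drop-off attempted at the wrong place
DROPOFF_REWARD = 20           # passenger delivered to the correct destination


def episode_return(moves, illegal_actions=0, delivered=True):
    """Sum the rewards for one episode.

    moves: ordinary timesteps taken before the final drop-off
    illegal_actions: number of incorrect pickup/drop-off attempts
    delivered: whether the episode ended with a successful drop-off
    """
    total = moves * STEP_PENALTY + illegal_actions * ILLEGAL_ACTION_PENALTY
    if delivered:
        total += DROPOFF_REWARD
    return total


print(episode_return(12))                     # efficient trip
print(episode_return(12, illegal_actions=1))  # same trip with one mistake
```

Under these assumed values, a single incorrect drop-off costs as much as ten wasted moves, so the agent is pushed to be both fast and accurate.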

Your agent's goal in solving this problem is to find the optimal policy for getting the passenger to their destination as efficiently as possible, netting the maximum reward for itself. While it navigates the environment, it will learn the best action to take from each state, which will serve as its policy function.

Remember that Q-learning is a value-based, off-policy method: when it updates its estimates, it does not take into account the policy your agent is actually following, and we will not explicitly enumerate that policy. Instead, the Q-learning algorithm calculates the value of each state-action pair from the reward received plus the highest estimated value among the actions available in the next state. In effect, it assumes that your agent will act optimally from that point on.
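The update described here can be sketched in a few lines. This is a minimal illustration, not the implementation we will build later: the function name `q_update`, the plain list-of-lists table, and the values chosen for the learning rate `alpha` and discount factor `gamma` are all assumptions made for the example.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one Q-learning update in place and return the new estimate.

    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    """
    # Best value reachable from the next state, regardless of which action
    # the agent will actually take there.
    best_next = max(q_table[next_state])
    td_target = reward + gamma * best_next
    td_error = td_target - q_table[state][action]
    q_table[state][action] += alpha * td_error
    return q_table[state][action]


# A 500-state, 6-action table initialised to zero.
q_table = [[0.0] * 6 for _ in range(500)]

first = q_update(q_table, state=42, action=1, reward=-1, next_state=43)
q_table[43][2] = 10.0  # pretend the agent has learned something about state 43
second = q_update(q_table, state=42, action=1, reward=-1, next_state=43)
print(first, second)
```

Note that the `max` over the next state's row is the only place the "policy" appears. The action the agent really takes next never enters the update, which is why the estimates behave as if the agent were already acting optimally.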

We will continue to explore this concept in more detail with the functions that you will write for your agent. OpenAI Gym will provide the game environment, and you will implement the Q-learning algorithm yourself. You can then use the same environment to implement other RL algorithms and compare their performance.
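To make this division of labor concrete, here is a hedged sketch of a training loop that talks to the environment only through `reset()` and `step()`, in the style of the classic Gym interface. So that it runs without Gym installed, it uses a tiny stand-in environment, `CorridorEnv`, which is our own invention; with Gym you would instead create the environment with something like `gym.make("Taxi-v2")` in a version that still ships it. The epsilon-greedy exploration and all hyperparameter values are assumptions for illustration.

```python
import random


class CorridorEnv:
    """Hypothetical stand-in environment with a Gym-style reset()/step()."""
    n_states = 5   # positions 0..4; position 4 is the goal
    n_actions = 2  # 0 = move left, 1 = move right

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.n_states - 1)
        done = self.position == self.n_states - 1
        reward = 20 if done else -1  # mirrors Taxi's "pay per step" idea
        return self.position, reward, done, {}


def train(env, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1,
          max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q_table = [[0.0] * env.n_actions for _ in range(env.n_states)]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(env.n_actions)  # explore
            else:
                action = max(range(env.n_actions),
                             key=lambda a: q_table[state][a])  # exploit
            next_state, reward, done, _ = env.step(action)
            # No future value is added once the episode has ended.
            best_next = 0.0 if done else max(q_table[next_state])
            target = reward + gamma * best_next
            q_table[state][action] += alpha * (target - q_table[state][action])
            state = next_state
            if done:
                break
    return q_table


env = CorridorEnv()
q_table = train(env)
greedy_policy = [max(range(env.n_actions), key=lambda a: q_table[s][a])
                 for s in range(env.n_states - 1)]
print(greedy_policy)
```

Because `train` depends only on the environment's interface, a different learning rule could be swapped in for the update line and run against the same environment, which is what makes side-by-side comparisons of algorithms straightforward.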
