
Alpha – deterministic versus stochastic environments

Your agent's learning rate, alpha, ranges from zero to one. Setting the learning rate to zero will cause your agent to learn nothing: its value estimates are never updated, so its exploration of the environment and the rewards it receives will not affect its behavior at all. It will continue to act exactly as it did before training, which for an untrained agent usually means behaving randomly.
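To make the role of alpha concrete, here is a minimal sketch that assumes the standard tabular Q-learning update rule. The function name `q_update`, the discount factor of 0.9, and the sample numbers are illustrative assumptions, not values taken from the text.

```python
def q_update(q_old, reward, max_next_q, alpha, gamma=0.9):
    """One tabular Q-learning update for a single (state, action) pair.

    gamma (the discount factor) and all sample values below are assumptions
    chosen for illustration only.
    """
    # The target is the reward just received plus the discounted best
    # estimate for the next state.
    td_target = reward + gamma * max_next_q
    # Alpha controls how far the old estimate moves toward the target.
    return q_old + alpha * (td_target - q_old)


q_old = 2.0
unchanged = q_update(q_old, reward=1.0, max_next_q=5.0, alpha=0.0)
halfway = q_update(q_old, reward=1.0, max_next_q=5.0, alpha=0.5)
overwritten = q_update(q_old, reward=1.0, max_next_q=5.0, alpha=1.0)

print(unchanged, halfway, overwritten)
```

With alpha at zero the old estimate comes back untouched, with alpha at one it is replaced outright by the new target, and values in between blend the two.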

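Setting the learning rate to one will cause your agent to discard its previous estimate at every update and replace it entirely with the newest observation. The resulting policy is fully specific to a deterministic environment, where a single observation tells the agent everything there is to know about taking an action from a state. One important distinction to understand is between deterministic and stochastic environments and policies.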

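Briefly, in a deterministic environment, the outcome is totally determined by the current state and the action taken, and there is no randomness involved: taking the same action from the same state always leads to the same next state and reward. Likewise, under a deterministic policy, we always take the same action from the same state.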

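In a stochastic environment, there is randomness involved, so the same action taken from the same state can lead to different outcomes. Under a stochastic policy, the decisions that we make are given as probability distributions over actions. In other words, we don't always take the same action from the same state.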
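The following sketch illustrates why a learning rate of one suits a deterministic environment but not a stochastic one. It is a simplified, single-step setting with no next state (so no discount factor), and the helper `run_updates`, the reward values, and the seed are all hypothetical choices made for illustration.

```python
import random


def run_updates(rewards, alpha, q_start=0.0):
    """Apply q <- q + alpha * (reward - q) once for each reward in turn."""
    q = q_start
    for reward in rewards:
        q += alpha * (reward - q)
    return q


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed deterministic environment: the action always pays 1.0.
deterministic_rewards = [1.0] * 1000
# Assumed stochastic environment: the action pays 0.0 or 2.0 with equal
# probability, so its average payoff is also 1.0.
stochastic_rewards = [rng.choice([0.0, 2.0]) for _ in range(1000)]

det_alpha_one = run_updates(deterministic_rewards, alpha=1.0)
sto_alpha_one = run_updates(stochastic_rewards, alpha=1.0)
sto_alpha_small = run_updates(stochastic_rewards, alpha=0.1)

print("deterministic, alpha=1.0:", det_alpha_one)
print("stochastic,    alpha=1.0:", sto_alpha_one)
print("stochastic,    alpha=0.1:", sto_alpha_small)
```

In the deterministic case, alpha at one recovers the true value after a single update. In the stochastic case, alpha at one simply echoes the most recent reward, whereas a smaller alpha averages over many observations and stays between the two possible payoffs.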
