官术网_书友最值得收藏!

Alpha – deterministic versus stochastic environments

Your agent's learning rate alpha ranges from zero to one. Setting the learning rate to zero will cause your agent to learn nothing. All of its exploration of its environment and the rewards it receives will not affect its behavior at all, and it will continue to behave completely randomly.

Setting the learning rate to one will cause your agent to learn policies that are fully specific to a deterministic environment. One important distinction to understand is between deterministic and stochastic environments and policies.

Briefly, in a deterministic environment, the output is totally determined by the initial conditions and there is no randomness involved. We always take the same action from the same state in a deterministic environment.

In a stochastic environment, there is randomness involved and the decisions that we make are given as probability distributions. In other words, we don't always take the same action from the same state. 

主站蜘蛛池模板: 缙云县| 青田县| 桐庐县| 宁河县| 秭归县| 贵阳市| 辰溪县| 阳春市| 新田县| 申扎县| 武功县| 体育| 墨江| 淮南市| 墨玉县| 长治县| 扶沟县| 上思县| 垦利县| 海林市| 彩票| 凤冈县| 泗阳县| 丰台区| 贵港市| 吉安市| 永济市| 龙陵县| 隆回县| 灌南县| 临沧市| 得荣县| 长岛县| 宜兴市| 三原县| 阜南县| 新密市| 克什克腾旗| 泾川县| 含山县| 凭祥市|