
Why do we use Xavier initialization?

The following factors call for the application of Xavier initialization:

  • If the weights in a network start very small, most of the signals shrink as they pass through each layer and become dormant at the activation functions in the later layers

  • If the weights start very large, most of the signals grow massively as they pass through each layer and become too large to be useful at the activation functions in the later layers

Thus, Xavier initialization helps in generating initial weights such that the signals stay within a reasonable range, thereby minimizing the chances of the signals getting either too small or too large.
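The effect described above can be sketched with a small NumPy experiment. This is only an illustration under stated assumptions, not the book's own code: the helper name signal_stats, the tanh activation, the 10 layers of width 256, and the Xavier (Glorot) normal variant with standard deviation sqrt(2 / (fan_in + fan_out)) are all choices made for this sketch.

```python
import numpy as np


def signal_stats(weight_std, n_layers=10, width=256, n_samples=512, seed=0):
    """Push random inputs through a stack of tanh layers (hypothetical helper).

    Returns the standard deviation of the last layer's activations and the
    fraction of those activations that are saturated (|a| > 0.99).
    """
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n_samples, width))
    for _ in range(n_layers):
        # Zero-mean Gaussian weights with the requested standard deviation.
        w = rng.standard_normal((width, width)) * weight_std
        a = np.tanh(a @ w)
    return float(a.std()), float(np.mean(np.abs(a) > 0.99))


width = 256
# Assumed Xavier (Glorot) normal variant: std = sqrt(2 / (fan_in + fan_out)).
xavier_std = np.sqrt(2.0 / (width + width))

results = {
    "very small (std=0.01)": signal_stats(0.01, width=width),
    "very large (std=1.0)": signal_stats(1.0, width=width),
    "xavier": signal_stats(xavier_std, width=width),
}

for name, (std, saturated) in results.items():
    print(f"{name:>22}: activation std = {std:.2e}, saturated = {saturated:.1%}")
```

With very small weights the activations in the last layer should collapse toward zero, with very large weights most of them should sit at the saturated ends of tanh, and with the Xavier-scaled weights they should stay in between. The exact numbers depend on the seed, depth, and width assumed here.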

The derivation of the preceding formula is beyond the scope of this book. For a better understanding, see the derivation at http://andyljones.tumblr.com/post/110998971763/an-explanation-of-xavier-initialization.
