What happens if we use too many neurons?

If we make our network architecture too complicated, two things will happen:

  • We're likely to develop a high variance model
  • The model will train slower than a less complicated model

If we add many layers, our gradients will get smaller and smaller until the first few layers barely train, which is called the vanishing gradient problem. We're nowhere near that yet, but we will talk about it later.
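To make the effect concrete, here is a minimal NumPy sketch (not a real training loop) of why gradients shrink: backpropagating through a stack of sigmoid layers multiplies the gradient by the sigmoid's derivative at every layer, and that derivative is at most 0.25, so the signal reaching the earliest layers decays geometrically. The layer count, width, and weight scale below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_layers, width = 20, 32
x = rng.normal(size=width)

# Forward pass through identical dense sigmoid layers
# (Xavier-style 1/sqrt(width) scaling keeps activations in a sane range).
weights = [rng.normal(scale=1.0 / np.sqrt(width), size=(width, width))
           for _ in range(n_layers)]
activations = [x]
for W in weights:
    activations.append(sigmoid(W @ activations[-1]))

# Backward pass: track the gradient's magnitude layer by layer,
# pretending the loss gradient at the output is all ones.
grad = np.ones(width)
norms = []
for W, a in zip(reversed(weights), reversed(activations[1:])):
    grad = W.T @ (grad * a * (1 - a))  # chain rule through the sigmoid
    norms.append(np.linalg.norm(grad))

print(f"gradient norm at last hidden layer:  {norms[0]:.3e}")
print(f"gradient norm at first hidden layer: {norms[-1]:.3e}")
```

Running this, the gradient norm at the first hidden layer comes out many orders of magnitude smaller than at the last one, which is the vanishing gradient problem in miniature.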

In (almost) the words of rap legend Christopher Wallace, aka Notorious B.I.G., the more neurons we come across, the more problems we see. With that said, the variance can be managed with dropout, regularization, and early stopping, and advances in GPU computing make deeper networks possible.
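As a quick sketch of the first of those tools, here is what (inverted) dropout does under the hood, in plain NumPy; the rate and array size are arbitrary illustration choices. During training, each activation is zeroed with probability `rate`, and the survivors are scaled up by `1 / (1 - rate)` so the expected activation is unchanged at inference time.

```python
import numpy as np

rng = np.random.default_rng(42)

def dropout(activations, rate=0.5, training=True):
    """Inverted dropout: zero each unit with probability `rate`,
    scale survivors so the expected value is preserved."""
    if not training:
        return activations  # at inference, the full network is used
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

a = np.ones(10_000)
dropped = dropout(a, rate=0.5)
print(f"fraction zeroed: {np.mean(dropped == 0):.2f}")  # roughly 0.50
print(f"mean activation: {dropped.mean():.2f}")         # roughly 1.00
```

Because each forward pass samples a different mask, no single neuron can be relied on too heavily, which is what makes dropout effective against the variance problem described above.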

If I had to pick between a network with too many neurons and one with too few, and I could only run one experiment, I'd err on the side of slightly too many.