KNN cons

  • The algorithm is fast for training but slow for inference.
  • You need to choose the best k somehow (see the Choosing a good k section).
  • With small values of k, the model can be badly affected by outliers; in other words, it's prone to overfitting.
  • You need to choose a distance metric. For ordinary real-valued features, there are many options to choose from (see the Calculating the distance section), and different metrics can yield different closest neighbors. The default metric in many machine learning packages is the Euclidean distance; however, this choice is nothing more than a tradition, and for many applications it is not optimal.
  • Model size grows as new data is incorporated.
  • It is unclear what to do when several identical samples have different labels. In this case, the result can differ depending on the order in which the samples are stored.
  • The model suffers from the curse of dimensionality.
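
Three of the points above (sensitivity to outliers at small k, dependence on the distance metric, and dependence on storage order for identical samples) can be reproduced with a few lines of code. The following is a minimal brute-force sketch, not a reference implementation: the function names (`euclidean`, `manhattan`, `knn_predict`) and all of the toy data are made up for illustration, and the tie-breaking behavior shown is a property of this particular sketch, which relies on Python's stable sort.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(samples, labels, query, k, distance=euclidean):
    # "Training" is just storing the data. All the work happens here,
    # at inference: the query is compared against every stored sample.
    order = sorted(range(len(samples)),
                   key=lambda i: distance(samples[i], query))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# 1. Small k and outliers: one mislabeled point sits inside the "a" cluster.
samples = [(0, 0), (0, 1), (1, 0),      # class "a"
           (5, 5), (5, 6), (6, 5),      # class "b"
           (0.5, 0.5)]                  # outlier labeled "b"
labels = ["a", "a", "a", "b", "b", "b", "b"]
query = (0.6, 0.6)
pred_k1 = knn_predict(samples, labels, query, k=1)
pred_k3 = knn_predict(samples, labels, query, k=3)
print("k=1:", pred_k1, "| k=3:", pred_k3)

# 2. Distance metric: the nearest neighbor of the origin changes
#    when Euclidean distance is swapped for Manhattan distance.
points = [(3, 3), (0, 5)]
names = ["p", "q"]
near_euclid = knn_predict(points, names, (0, 0), k=1, distance=euclidean)
near_manhattan = knn_predict(points, names, (0, 0), k=1, distance=manhattan)
print("Euclidean:", near_euclid, "| Manhattan:", near_manhattan)

# 3. Identical samples with different labels: with a stable sort,
#    whichever duplicate was stored first wins.
dup = [(1, 1), (1, 1)]
first_order = knn_predict(dup, ["a", "b"], (1, 1), k=1)
second_order = knn_predict(dup, ["b", "a"], (1, 1), k=1)
print("stored a,b:", first_order, "| stored b,a:", second_order)
```

With k = 1 the outlier alone decides the prediction, while k = 3 lets the two genuine neighbors outvote it. In the second case, (3, 3) is closer to the origin under Euclidean distance, but (0, 5) is closer under Manhattan distance. The `sorted` call over all stored samples also makes the inference cost visible: every prediction touches the whole dataset.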