Choosing a good k
It is important to pick a proper value of the hyperparameter k, since a poor choice can degrade the model's performance just as a good one can improve it. One popular rule of thumb is to take the square root of the number of training samples; many software packages use this heuristic as the default value of k. Unfortunately, it doesn't always work well, because the best k depends on the data and on the distance metric.
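For illustration, here is a minimal Swift sketch of that rule of thumb. The function name is ours, and the nudge to the nearest odd integer (so a two-class majority vote cannot tie) is an extra assumption, not part of the heuristic itself:

```swift
import Foundation

// Heuristic default: k ≈ √n, nudged to an odd value so that
// a two-class majority vote cannot end in a tie.
func heuristicK(forSampleCount n: Int) -> Int {
    let k = max(1, Int(Double(n).squareRoot().rounded()))
    return k % 2 == 0 ? k + 1 : k
}

heuristicK(forSampleCount: 400) // 21
```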
There is no mathematically grounded way to come up with the optimal number of neighbors in advance. The only option is to scan through a range of k values and choose the best one according to some performance metric. You can use any of the performance metrics described in the previous chapter: accuracy, F1, and so on. Cross-validation is especially useful when the data is scarce.
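As a sketch of such a scan, the snippet below implements a bare-bones KNN classifier (Euclidean distance, majority vote) and picks the k with the highest accuracy on a held-out validation set. All names here (`Sample`, `knnPredict`, `bestK`) are illustrative, not from any particular library:

```swift
import Foundation

typealias Sample = (features: [Double], label: Int)

// Squared Euclidean distance; the square root is omitted because
// it does not change the ordering of neighbors.
func squaredDistance(_ a: [Double], _ b: [Double]) -> Double {
    zip(a, b).reduce(0) { sum, pair in
        let d = pair.0 - pair.1
        return sum + d * d
    }
}

// Bare-bones KNN: majority vote among the k nearest training samples.
func knnPredict(train: [Sample], query: [Double], k: Int) -> Int {
    let nearest = train
        .sorted { squaredDistance($0.features, query) < squaredDistance($1.features, query) }
        .prefix(k)
    var votes: [Int: Int] = [:]
    for sample in nearest { votes[sample.label, default: 0] += 1 }
    return votes.max { $0.value < $1.value }!.key // assumes k >= 1 and a non-empty training set
}

// Scan candidate k values and keep the one with the best
// accuracy on the validation set.
func bestK(train: [Sample], validation: [Sample], candidates: [Int]) -> Int {
    func accuracy(_ k: Int) -> Double {
        let correct = validation
            .filter { knnPredict(train: train, query: $0.features, k: k) == $0.label }
            .count
        return Double(correct) / Double(validation.count)
    }
    return candidates.max { accuracy($0) < accuracy($1) }!
}
```

With scarce data you would replace the single held-out set with k-fold cross-validation, averaging the metric over the folds before comparing the candidates.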
In fact, there is a variation of KNN that doesn't require k at all. The idea is to have the algorithm search for neighbors within a ball of a given radius; the effective k is then different for each point, depending on the local density. This variation is known as radius-based neighbor learning. It suffers from the n-ball volume problem (see the next section): the more features you have, the larger the radius must be to catch even one neighbor.
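A minimal sketch of this variant, reusing the `squaredDistance` helper and `Sample` alias from the previous snippet (again, the function name and signature are our own):

```swift
// Radius-based variant: collect every training label within a ball of
// the given radius around the query. The effective k varies with the
// local density and can be zero in sparse regions, so the caller must
// handle an empty result.
func radiusNeighborLabels(train: [Sample], query: [Double], radius: Double) -> [Int] {
    train
        .filter { squaredDistance($0.features, query) <= radius * radius }
        .map { $0.label }
}
```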