
Using instance-based models for classification and clustering

Instance-based machine learning algorithms are usually easy to understand as they have some geometrical intuition behind them. They can be used to perform different kinds of tasks, including classification, regression, clustering, and anomaly detection.

It's easy to confuse classification and clustering at first. As a reminder, classification is one of the many types of supervised learning. The task is to predict a discrete label from a set of features (Figure 3.4, left pane). Technically, classification comes in two types: binary (check yes or no), and multiclass (yes/no/maybe/I don't know/can you repeat the question?). In practice, however, you can always build a multiclass classifier from several binary classifiers.
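One common way to combine binary classifiers into a multiclass one is the one-vs-rest scheme: train one "this class or not" scorer per class, then pick the class whose scorer is most confident. The following is a minimal sketch of that idea, not a method taken from this chapter. The helper names (`train_binary`, `train_one_vs_rest`, `predict_one_vs_rest`), the toy data, and the centroid-based binary scorer are all assumptions made for illustration; any binary classifier that returns a score could take its place.

```python
import math

def train_binary(points, is_positive):
    """Hypothetical binary scorer: stores the centroids of the positive
    and negative samples and scores by relative distance to them."""
    def centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    pos = centroid([p for p, y in zip(points, is_positive) if y])
    neg = centroid([p for p, y in zip(points, is_positive) if not y])
    # A score above zero means the sample is closer to the positive centroid.
    return lambda x: math.dist(x, neg) - math.dist(x, pos)

def train_one_vs_rest(points, labels):
    """Train one binary scorer per class: that class versus all the others."""
    return {
        c: train_binary(points, [y == c for y in labels])
        for c in sorted(set(labels))
    }

def predict_one_vs_rest(scorers, x):
    """Return the class whose binary scorer gives the highest score."""
    return max(scorers, key=lambda c: scorers[c](x))

# Toy two-dimensional data set with three classes (illustrative only).
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]
labels = ["yes", "yes", "no", "no", "maybe", "maybe"]

scorers = train_one_vs_rest(points, labels)
print(predict_one_vs_rest(scorers, (0.2, 0.5)))
print(predict_one_vs_rest(scorers, (9, 0.5)))
```

With three classes, this trains three binary scorers. One-vs-one, which trains a scorer for every pair of classes and lets them vote, is the other common scheme.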

Clustering, on the other hand, is an unsupervised learning task. This means that, unlike classification, it knows nothing about data labels and works out clusters of similar samples in your data on its own. In the next chapter, we are going to discuss an instance-based clustering algorithm called k-means, and in this chapter, we focus on applying the instance-based algorithm k-nearest neighbors (KNN) to multiclass classification:

  
Figure 3.4: Classification process (on the left) and clustering (on the right). Classification consists of two steps: training with labeled data and inference on unlabeled data. Clustering groups samples according to their similarity.
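To make the two classification steps concrete, here is a minimal sketch of a KNN classifier in Python. It is an illustration under assumptions, not the implementation developed in this book: the function name `knn_classify`, the Euclidean distance metric, the default `k=3`, and the toy data are all chosen for the example. For KNN, "training" amounts to storing the labeled samples, and inference is a majority vote among the `k` stored samples closest to the query.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest
    labeled samples (Euclidean distance is an assumption of this sketch)."""
    # Rank the stored samples by distance to the query and keep the k closest.
    nearest = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    # Count the labels of the neighbors; on a tied count, the label
    # that appeared first among the nearest neighbors wins.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# "Training": the labeled data is simply kept around (toy data, illustrative only).
train_points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]
train_labels = ["yes", "yes", "no", "no", "maybe", "maybe"]

# Inference on unlabeled samples.
print(knn_classify(train_points, train_labels, (1, 1)))
print(knn_classify(train_points, train_labels, (9, 1)))
```

Because the sketch sorts every stored sample for each query, inference cost grows with the size of the training set, which is the usual trade-off of instance-based methods.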