官术网_书友最值得收藏!

Clustering

Typically, when people talk about unsupervised learning, they talk about cluster analysis or clustering. A cluster analysis algorithm takes a set of data points and tries to categorize them into groups such that similar items belong to the same group, and different items do not. There are many ways where it can be used, for example, in customer segmentation or text categorization.

Customer segmentation is an example of clustering. Given some description of customers, we try to put them into groups such that the customers in one group have similar profiles and behave in a similar way. This information can be used to understand what do the people in these groups want, and this can be used to target them with better advertisements and other promotional messages.

Another example is text categorization. Given a collection of texts, we would like to find common topics among these texts and arrange the texts according to these topics. For example, given a set of complaints in an e-commerce store, we may want to put ones that talk about similar things together, and this should help the users of the system navigate through the complaints easier.

Examples of cluster analysis algorithms are hierarchical clustering, k-means, density-based spatial clustering of applications with noise (DBSCAN), and many others. We will talk about clustering in detail in the first part of Chapter 5, Unsupervised Learning - Clustering and Dimensionality Reduction.

主站蜘蛛池模板: 彰化市| 莱阳市| 临漳县| 祁门县| 历史| 湛江市| 鄂尔多斯市| 无锡市| 安康市| 中西区| 永州市| 高清| 松阳县| 广东省| 邢台市| 广灵县| 金湖县| 新民市| 杨浦区| 大渡口区| 龙井市| 万宁市| 岑巩县| 石渠县| 杂多县| 桦甸市| 宝清县| 青海省| 甘泉县| 曲麻莱县| 蓝田县| 九龙县| 昌邑市| 双江| 美姑县| 瑞昌市| 合肥市| 灵寿县| 长寿区| 汤阴县| 徐州市|