官术网_书友最值得收藏!

Clustering

Typically, when people talk about unsupervised learning, they talk about cluster analysis or clustering. A cluster analysis algorithm takes a set of data points and tries to categorize them into groups such that similar items belong to the same group, and different items do not. There are many ways where it can be used, for example, in customer segmentation or text categorization.

Customer segmentation is an example of clustering. Given some description of customers, we try to put them into groups such that the customers in one group have similar profiles and behave in a similar way. This information can be used to understand what do the people in these groups want, and this can be used to target them with better advertisements and other promotional messages.

Another example is text categorization. Given a collection of texts, we would like to find common topics among these texts and arrange the texts according to these topics. For example, given a set of complaints in an e-commerce store, we may want to put ones that talk about similar things together, and this should help the users of the system navigate through the complaints easier.

Examples of cluster analysis algorithms are hierarchical clustering, k-means, density-based spatial clustering of applications with noise (DBSCAN), and many others. We will talk about clustering in detail in the first part of Chapter 5, Unsupervised Learning - Clustering and Dimensionality Reduction.

主站蜘蛛池模板: 淄博市| 胶南市| 绿春县| 台江县| 温州市| 陆川县| 拜城县| 抚顺县| 监利县| 凤山县| 中方县| 临高县| 安康市| 日喀则市| 保定市| 湾仔区| 会泽县| 永善县| 江北区| 修水县| 定兴县| 南郑县| 边坝县| 南岸区| 潮安县| 民权县| 常山县| 灌云县| 涿鹿县| 天水市| 黔西县| 共和县| 江山市| 通江县| 罗平县| 松原市| 巴彦淖尔市| 从化市| 高州市| 隆昌县| 通江县|