- Mastering Java for Data Science
- Alexey Grigorev
- 234字
- 2021-07-02 23:44:31
Clustering
Typically, when people talk about unsupervised learning, they talk about cluster analysis or clustering. A cluster analysis algorithm takes a set of data points and tries to categorize them into groups such that similar items belong to the same group, and different items do not. There are many ways where it can be used, for example, in customer segmentation or text categorization.
Customer segmentation is an example of clustering. Given some description of customers, we try to put them into groups such that the customers in one group have similar profiles and behave in a similar way. This information can be used to understand what do the people in these groups want, and this can be used to target them with better advertisements and other promotional messages.
Another example is text categorization. Given a collection of texts, we would like to find common topics among these texts and arrange the texts according to these topics. For example, given a set of complaints in an e-commerce store, we may want to put ones that talk about similar things together, and this should help the users of the system navigate through the complaints easier.
Examples of cluster analysis algorithms are hierarchical clustering, k-means, density-based spatial clustering of applications with noise (DBSCAN), and many others. We will talk about clustering in detail in the first part of Chapter 5, Unsupervised Learning - Clustering and Dimensionality Reduction.
- GitHub Essentials
- LibGDX Game Development Essentials
- 在你身邊為你設計Ⅲ:騰訊服務設計思維與實戰
- Java Data Science Cookbook
- PySpark大數據分析與應用
- Python數據分析:基于Plotly的動態可視化繪圖
- Sybase數據庫在UNIX、Windows上的實施和管理
- 大數據Hadoop 3.X分布式處理實戰
- 數據驅動設計:A/B測試提升用戶體驗
- MySQL 8.x從入門到精通(視頻教學版)
- Python金融數據分析(原書第2版)
- 高維數據分析預處理技術
- 智慧城市中的大數據分析技術
- 從Lucene到Elasticsearch:全文檢索實戰
- Python金融數據挖掘與分析實戰