- Applied Unsupervised Learning with Python
- Benjamin Johnston Aaron Jones Christopher Kruger
- 2021-06-11 13:23:57
Summary
In this chapter, we discussed how hierarchical clustering works and where it is best employed. In particular, we discussed how the number of clusters can be chosen subjectively by evaluating a dendrogram plot. This is a significant advantage over k-means clustering, which requires the number of clusters up front, when you have no idea what you're looking for in the data. Two key choices that drive the success of hierarchical clustering were also discussed: the agglomerative versus divisive approach, and the linkage criteria. Agglomerative clustering takes a bottom-up approach, recursively grouping nearby data points together until they form one large cluster. Divisive clustering takes a top-down approach, starting with one large cluster and recursively breaking it down until each data point falls into its own cluster. Divisive clustering has the potential to be more accurate since it has a complete view of the data from the start; however, it adds a layer of complexity that can decrease stability and increase runtime.
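The bottom-up procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the chapter's own implementation: the function names `centroid` and `agglomerate`, the sample points, and the choice of centroid distance as the merge rule are all assumptions made for this example.

```python
from itertools import combinations
from math import dist


def centroid(cluster):
    """Mean position of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def agglomerate(points, n_clusters=1):
    """Repeatedly merge the two clusters with the closest centroids.

    Hypothetical helper for illustration; stops when n_clusters remain.
    Returns the final clusters and the merge history as
    (merge distance, size of merged cluster) pairs.
    """
    clusters = [[p] for p in points]  # bottom-up: every point starts alone
    merges = []
    while len(clusters) > n_clusters:
        # Find the pair of clusters whose centroids are closest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(centroid(clusters[ij[0]]),
                                centroid(clusters[ij[1]])),
        )
        d = dist(centroid(clusters[i]), centroid(clusters[j]))
        merged = clusters[i] + clusters[j]
        merges.append((d, len(merged)))
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, merges


# Made-up sample data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]

clusters, merges = agglomerate(points, n_clusters=3)
full_tree, history = agglomerate(points)  # merge all the way to one cluster

for distance, size in history:
    print(f"merged at distance {distance:.2f} into a cluster of {size} points")
```

The merge distances in `history` play the role of the heights in a dendrogram: a large jump between consecutive merges suggests a natural place to cut the tree.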
Linkage criteria determine how distance is calculated between candidate clusters. We explored how centroids reappear beyond k-means clustering, alongside the single and complete linkage criteria. Single linkage measures the distance between two clusters using the closest pair of points, one from each cluster, while complete linkage uses the most distant pair. With the understanding you have gained in this chapter, you are now able to evaluate whether k-means or hierarchical clustering best fits the challenge you are working on. In the next chapter, we will cover a clustering approach that serves us best with highly complex data: DBSCAN (Density-Based Spatial Clustering of Applications with Noise).
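To make the difference between the linkage criteria concrete, the sketch below computes single, complete, and centroid distances between two small clusters. The function names and the sample points are hypothetical, chosen only for illustration, and Euclidean distance is assumed.

```python
from itertools import product
from math import dist


def single_linkage(a, b):
    """Distance between the closest pair of points, one from each cluster."""
    return min(dist(p, q) for p, q in product(a, b))


def complete_linkage(a, b):
    """Distance between the most distant pair of points, one from each cluster."""
    return max(dist(p, q) for p, q in product(a, b))


def centroid_linkage(a, b):
    """Distance between the mean positions of the two clusters."""
    def centroid(cluster):
        n = len(cluster)
        return tuple(sum(coords) / n for coords in zip(*cluster))
    return dist(centroid(a), centroid(b))


# Made-up clusters lying on the x-axis so the distances are easy to check.
left = [(0.0, 0.0), (1.0, 0.0)]
right = [(3.0, 0.0), (6.0, 0.0)]

print(single_linkage(left, right))    # closest pair: (1, 0) and (3, 0)
print(complete_linkage(left, right))  # most distant pair: (0, 0) and (6, 0)
print(centroid_linkage(left, right))  # centroids: (0.5, 0) and (4.5, 0)
```

For the same pair of clusters the three criteria give different distances, which is why the choice of linkage can change which clusters are merged first.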