  • Machine Learning in Java
  • AshishSingh Bhatia, Bostjan Kaluza
  • 2021-06-10 19:29:59

Clustering

Clustering is a technique for grouping similar instances into clusters according to some distance measure. The main idea is to put instances that are similar (that is, close to each other) into the same cluster, while keeping dissimilar instances (that is, those farther apart from each other) in different clusters. An example of what clusters might look like is shown in the following diagram:

Clustering algorithms follow two fundamentally different approaches. The first is a hierarchical, or agglomerative, approach that starts by treating each point as its own cluster and then iteratively merges the most similar clusters. Merging stops once a predefined number of clusters is reached, or when the clusters that would be merged next are spread over too large a region.
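The agglomerative procedure described above can be sketched in a few lines of Java. This is a minimal illustration, not the book's implementation: the class name `AgglomerativeSketch`, its methods, and the choice of single-linkage Euclidean distance as the similarity measure are all assumptions made for the example, and only the first stopping rule (a predefined number of clusters) is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; the class and method names are hypothetical.
final class AgglomerativeSketch {

    // Euclidean distance between two points of equal dimension.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Single-linkage distance (an assumption of this sketch):
    // the smallest distance between any pair of members.
    static double linkage(List<double[]> c1, List<double[]> c2) {
        double best = Double.POSITIVE_INFINITY;
        for (double[] p : c1) {
            for (double[] q : c2) {
                best = Math.min(best, distance(p, q));
            }
        }
        return best;
    }

    // Starts with one cluster per point and merges the closest pair
    // until only k clusters remain.
    static List<List<double[]>> cluster(double[][] points, int k) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be at least 1");
        }
        List<List<double[]>> clusters = new ArrayList<>();
        for (double[] p : points) {
            List<double[]> single = new ArrayList<>();
            single.add(p);
            clusters.add(single);
        }
        while (clusters.size() > k) {
            int bestI = 0;
            int bestJ = 1;
            double bestDist = Double.POSITIVE_INFINITY;
            for (int i = 0; i < clusters.size(); i++) {
                for (int j = i + 1; j < clusters.size(); j++) {
                    double d = linkage(clusters.get(i), clusters.get(j));
                    if (d < bestDist) {
                        bestDist = d;
                        bestI = i;
                        bestJ = j;
                    }
                }
            }
            // bestJ > bestI, so removing bestJ does not shift bestI.
            clusters.get(bestI).addAll(clusters.remove(bestJ));
        }
        return clusters;
    }
}
```

Calling `AgglomerativeSketch.cluster(points, 3)` returns three lists of points. Recomputing every pairwise linkage on each pass keeps the sketch short but is slow on large datasets; a practical implementation would cache the distances.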

The other approach is based on point assignment. First, initial cluster centers (that is, centroids) are estimated, for instance by choosing them at random; each point is then assigned to its closest centroid until all of the points are assigned. The best-known algorithm in this group is k-means clustering.
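To make the point-assignment idea concrete, here is a minimal Java sketch of the usual k-means loop. The text above only describes the assignment step; the centroid update and the repetition until assignments stop changing are the standard k-means iteration, added here for completeness. The class name `KMeansSketch`, its method names, and the `maxIterations` parameter are hypothetical, and the caller is assumed to supply the initial centroids (for example, randomly chosen points).

```java
import java.util.Arrays;

// Illustrative sketch only; the class and method names are hypothetical.
final class KMeansSketch {

    // Squared Euclidean distance; sufficient for comparing which centroid is closer.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Index of the centroid closest to point p.
    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double d = squaredDistance(p, centroids[c]);
            if (d < bestDist) {
                bestDist = d;
                best = c;
            }
        }
        return best;
    }

    // Alternates assignment and centroid update until no assignment changes
    // or maxIterations is reached. Updates centroids in place and returns
    // the cluster index of every point.
    static int[] fit(double[][] points, double[][] centroids, int maxIterations) {
        int[] labels = new int[points.length];
        Arrays.fill(labels, -1);
        int dim = points[0].length;
        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: each point goes to its closest centroid.
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int c = nearest(points[i], centroids);
                if (c != labels[i]) {
                    labels[i] = c;
                    changed = true;
                }
            }
            if (!changed) {
                break;
            }
            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[centroids.length][dim];
            int[] counts = new int[centroids.length];
            for (int i = 0; i < points.length; i++) {
                counts[labels[i]]++;
                for (int d = 0; d < dim; d++) {
                    sums[labels[i]][d] += points[i][d];
                }
            }
            for (int c = 0; c < centroids.length; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous centroid
                }
                for (int d = 0; d < dim; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return labels;
    }
}
```

Leaving an empty cluster's centroid unchanged is one simple design choice among several; re-seeding it from a distant point is another common option.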

The k-means algorithm picks its initial cluster centers in one of two ways: either it chooses points that are as far as possible from one another, or it (hierarchically) clusters a sample of the data and, for each of the k resulting clusters, picks the point closest to that cluster's center.
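The first initialization strategy, picking centers that are far from one another, can be sketched with a greedy farthest-first pass. This is an assumption-laden illustration rather than the book's code: the class name `FarthestFirstInit` and its methods are hypothetical, and starting from the first point in the array is an arbitrary choice (a random starting point would serve equally well).

```java
import java.util.Arrays;

// Illustrative sketch only; the class and method names are hypothetical.
final class FarthestFirstInit {

    // Squared Euclidean distance between two points of equal dimension.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Picks k initial centers. The first is points[0]; each later center is
    // the point whose distance to its nearest already-chosen center is largest.
    static double[][] pick(double[][] points, int k) {
        if (k < 1 || points.length == 0) {
            throw new IllegalArgumentException("need k >= 1 and at least one point");
        }
        double[][] centers = new double[k][];
        // minDist[i] holds the squared distance from point i to its nearest chosen center.
        double[] minDist = new double[points.length];
        Arrays.fill(minDist, Double.POSITIVE_INFINITY);
        centers[0] = points[0].clone();
        for (int c = 1; c < k; c++) {
            int farthest = 0;
            for (int i = 0; i < points.length; i++) {
                // Account for the center chosen in the previous round.
                minDist[i] = Math.min(minDist[i], squaredDistance(points[i], centers[c - 1]));
                if (minDist[i] > minDist[farthest]) {
                    farthest = i;
                }
            }
            centers[c] = points[farthest].clone();
        }
        return centers;
    }
}
```

The returned array can be passed as the initial centroids to any k-means routine. Because the greedy pass always favors the most distant point, it tends to select outliers, which is one reason the sample-and-cluster alternative is sometimes preferred.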