Cluster analysis

Cluster analysis is a multivariate analysis technique that groups statistical units so as to minimize the logical distance within each group and maximize the logical distance between groups. The logical distance is quantified by means of similarity/dissimilarity measures defined between the statistical units.
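To make the within-group and between-group distances concrete, here is a minimal sketch in plain Python. It assumes Euclidean distance as the dissimilarity measure, and the toy points and both groupings are hypothetical, chosen only for illustration. It is not how the toolbox computes anything.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def mean_within_distance(groups):
    """Average distance between pairs of points that share a group."""
    pairs = [dist(a, b) for g in groups for a, b in combinations(g, 2)]
    return sum(pairs) / len(pairs)


def mean_between_distance(groups):
    """Average distance between pairs of points from different groups."""
    pairs = [dist(a, b)
             for g1, g2 in combinations(groups, 2)
             for a in g1 for b in g2]
    return sum(pairs) / len(pairs)


# Hypothetical toy data: six points forming two well-separated groups.
good = [[(0, 0), (0, 1), (1, 0)], [(10, 10), (10, 11), (11, 10)]]
# The same six points, grouped badly on purpose.
bad = [[(0, 0), (10, 10), (1, 0)], [(0, 1), (10, 11), (11, 10)]]

within_good = mean_within_distance(good)
between_good = mean_between_distance(good)
within_bad = mean_within_distance(bad)
between_bad = mean_between_distance(bad)
```

The sensible grouping has a smaller within-group distance and a larger between-group distance than the bad one, which is the property a clustering algorithm tries to achieve.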

The Statistics and Machine Learning Toolbox provides several algorithms to carry out cluster analysis. Available algorithms include:

  • k-means
  • k-medoids
  • Hierarchical clustering
  • Gaussian mixture models (GMM)
  • Hidden Markov models (HMM)
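As an illustration of the first algorithm in the list, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is a teaching sketch, not the toolbox's implementation: the function name `kmeans`, its parameters, the toy points, and the choice of starting centroids are all assumptions made for this example.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate an assignment step and an update step."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy data: two visually obvious groups in the plane.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
# Assumed initialization: one starting centroid taken from each group.
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
```

The result depends on the starting centroids, which is why practical implementations usually repeat the procedure from several initializations and keep the best run.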

When the number of clusters is unknown, we can use cluster evaluation techniques to determine the number of clusters present in the data based on a specified metric.
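One common evaluation criterion is the silhouette value, which compares how close each point is to its own cluster with how close it is to the nearest other cluster. The sketch below, in plain Python, scores a few candidate partitions and keeps the number of clusters with the highest mean silhouette. The data and the candidate labelings are hypothetical and written by hand. In practice they would come from running a clustering algorithm once per candidate number of clusters.

```python
from math import dist  # Euclidean distance, Python 3.8+


def silhouette(points, labels):
    """Mean silhouette value of a partition with at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: a singleton scores 0
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical toy data: three compact groups of three points each.
points = [(0, 0), (0, 1), (1, 0),
          (10, 0), (10, 1), (11, 0),
          (5, 9), (5, 10), (6, 9)]

# Hand-written candidate partitions for k = 2, 3, 4 (assumptions).
candidates = {
    2: [0, 0, 0, 0, 0, 0, 1, 1, 1],  # first two groups merged
    3: [0, 0, 0, 1, 1, 1, 2, 2, 2],  # the natural grouping
    4: [0, 0, 0, 1, 1, 1, 2, 2, 3],  # last group split in two
}

scores = {k: silhouette(points, labels) for k, labels in candidates.items()}
best_k = max(scores, key=scores.get)
```

Merging two natural groups or splitting one both lower the mean silhouette, so the criterion favors three clusters for this data.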

A typical cluster analysis result is shown in the following figure:

Figure 1.19: A cluster analysis example

In addition, the Statistics and Machine Learning Toolbox lets us visualize clusters by creating a dendrogram plot, which displays a hierarchical binary cluster tree. We can then optimize the leaf order to maximize the sum of the similarities between adjacent leaves. Finally, for grouped data with multiple measurements for each group, we can create a dendrogram plot based on the group means computed using a multivariate analysis of variance.
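The binary tree that a dendrogram displays can be built by repeatedly merging the two closest clusters. The following plain-Python sketch shows that idea using single linkage (the distance between two clusters is the distance between their closest members). The function name `single_linkage` and the toy points are assumptions for illustration. The sketch covers neither leaf-order optimization nor the group-means variant.

```python
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage(points):
    """Return the merges as (cluster_a, cluster_b, distance), in merge order."""
    clusters = [(i,) for i in range(len(points))]  # each leaf starts alone
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical toy data: four points, identified by their index 0..3.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 3.0)]
merges = single_linkage(points)
```

Each entry of `merges` corresponds to one junction of the dendrogram, and the merge distance gives the height at which that junction is drawn.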
