
Cluster analysis

Cluster analysis is a multivariate analysis technique that groups statistical units so as to minimize the logical distance within each group and maximize the logical distance between groups. The logical distance is quantified by measures of similarity or dissimilarity between the statistical units.
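To make the notion of distance concrete, here is a minimal sketch in plain Python (it does not use the toolbox). It assumes Euclidean distance as the dissimilarity measure, which is only one possible choice, and the two groups are made-up illustration data. It compares the average distance of points to their own group's centroid with the distance between the two centroids.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def mean_within_distance(points):
    """Average distance from each point to its group's centroid."""
    c = centroid(points)
    return sum(euclidean(p, c) for p in points) / len(points)

# Two hypothetical groups of 2-D observations (made-up illustration data).
group_a = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)]
group_b = [(8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]

within_a = mean_within_distance(group_a)
within_b = mean_within_distance(group_b)
between = euclidean(centroid(group_a), centroid(group_b))

print(f"within A: {within_a:.3f}, within B: {within_b:.3f}, between: {between:.3f}")
```

A grouping is a good clustering in this sense when the within-group values are small relative to the between-group value, as they are for these two groups.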

The Statistics and Machine Learning Toolbox provides several algorithms to carry out cluster analysis. Available algorithms include:

  • k-means
  • k-medoids
  • Hierarchical clustering
  • Gaussian mixture models (GMM)
  • Hidden Markov models (HMM)
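As an illustration of the first algorithm in the list, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not the toolbox implementation. The function name, the explicit initial centroids, and the sample points are assumptions made for this example; real implementations usually choose the starting centroids at random or with a seeding heuristic.

```python
import math

def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Made-up 2-D data forming two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, initial_centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

Passing the initial centroids explicitly keeps the sketch deterministic. With random initialization the result can vary between runs, which is why practical implementations often repeat the procedure several times and keep the best solution.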

When the number of clusters is unknown, we can use cluster evaluation techniques to determine the number of clusters present in the data based on a specified metric.
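One way to picture such an evaluation is the sketch below, written in plain Python rather than with the toolbox. It assumes the mean silhouette value as the metric, which is a common choice but is not named in the text, and it compares two hand-written candidate labelings of made-up data. It expects at least two clusters in each labeling.

```python
import math

def mean_silhouette(points, labels):
    """Average silhouette value over all points (ranges from -1 to 1)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items() if other_lab != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Made-up 2-D data and two hypothetical labelings (2 clusters vs. 3 clusters).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels_k2 = [0, 0, 0, 1, 1, 1]
labels_k3 = [0, 0, 2, 1, 1, 1]

score_k2 = mean_silhouette(points, labels_k2)
score_k3 = mean_silhouette(points, labels_k3)
best_k = 2 if score_k2 >= score_k3 else 3
print(f"k=2: {score_k2:.3f}, k=3: {score_k3:.3f}, preferred k: {best_k}")
```

The labeling with the higher mean silhouette is preferred. In practice the candidate labelings would come from running a clustering algorithm once for each candidate number of clusters.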

A typical cluster analysis result is shown in the following figure:

Figure 1.19: A cluster analysis example

In addition, the Statistics and Machine Learning Toolbox can visualize clusters as a dendrogram, a plot of the hierarchical binary cluster tree. The leaf order can then be optimized to maximize the sum of the similarities between adjacent leaves. Finally, for grouped data with multiple measurements per group, a dendrogram can be built from the group means computed by a multivariate analysis of variance (MANOVA).
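The binary cluster tree behind a dendrogram can be sketched in plain Python as follows. This is an illustration rather than the toolbox's implementation: it assumes single linkage (distance between the closest members) and Euclidean distance, uses made-up points, and covers only the tree construction, not the leaf-order optimization or the MANOVA-based variant.

```python
import math
from itertools import combinations

def single_linkage_tree(points):
    """Build a binary merge tree as a list of (cluster_a, cluster_b, distance)."""
    # Each active cluster is a tuple of point indices; start with singletons.
    clusters = [(i,) for i in range(len(points))]
    merges = []

    def linkage_distance(pair):
        # Single linkage: cluster distance is that of the closest pair of members.
        a, b = pair
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        # Merge the two closest clusters and record the height of the merge.
        a, b = min(combinations(clusters, 2), key=linkage_distance)
        merges.append((a, b, linkage_distance((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Made-up 2-D data: two close pairs and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.5), (20.0, 0.0)]
merges = single_linkage_tree(points)
for a, b, height in merges:
    print(a, "+", b, "at height", height)
```

Each recorded merge corresponds to one junction in the dendrogram, drawn at the height of the merge distance. A set of n points always yields n - 1 merges.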
