
Cluster analysis

Cluster analysis is a multivariate analysis technique that groups statistical units so as to minimize the logical distance within each group and maximize the logical distance between groups. The logical distance is quantified by measures of similarity or dissimilarity between the statistical units.

The Statistics and Machine Learning Toolbox provides several algorithms to carry out cluster analysis. Available algorithms include:

  • k-means
  • k-medoids
  • Hierarchical clustering
  • Gaussian mixture models (GMM)
  • Hidden Markov models (HMM)
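
As a minimal sketch of how one of these algorithms might be called, the following uses kmeans on the fisheriris sample dataset that ships with the toolbox. The choice of dataset, the number of clusters (k = 3), and the number of replicates are assumptions made purely for illustration:

```matlab
% Load the Fisher iris sample data (assumed dataset for illustration)
load fisheriris            % provides meas (150-by-4) and species
rng(1);                    % fix the random seed so the run is repeatable

% Partition the observations into k clusters (k = 3 is an assumption)
k = 3;
[idx, C] = kmeans(meas, k, 'Replicates', 5);

% Plot petal length against petal width, colored by cluster index
gscatter(meas(:,3), meas(:,4), idx);
hold on
plot(C(:,3), C(:,4), 'kx', 'MarkerSize', 12, 'LineWidth', 2);  % centroids
hold off
xlabel('Petal length (cm)');
ylabel('Petal width (cm)');
legend('Cluster 1', 'Cluster 2', 'Cluster 3', 'Centroids', 'Location', 'best');
```

Here idx holds the cluster index assigned to each observation and C holds the centroid coordinates. The other algorithms in the list have their own entry points, such as kmedoids, linkage, and fitgmdist.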

When the number of clusters is unknown, we can use cluster evaluation techniques to determine the number of clusters present in the data based on a specified metric.
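
A possible way to carry out such an evaluation is sketched below with evalclusters. The silhouette criterion, the candidate range of 2 to 6 clusters, and the fisheriris dataset are assumptions chosen for illustration, not a recommendation:

```matlab
% Load sample data (assumed dataset for illustration)
load fisheriris
rng(1);                    % fix the random seed so the run is repeatable

% Evaluate k-means solutions for 2 to 6 clusters using the silhouette criterion
eva = evalclusters(meas, 'kmeans', 'silhouette', 'KList', 2:6);

% Show the number of clusters the criterion suggests
disp(eva.OptimalK);

% Plot the criterion value against the number of clusters
plot(eva);
```

Different criteria can suggest different numbers of clusters for the same data, so it may be worth comparing more than one.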

A typical cluster analysis result is shown in the following figure:

Figure 1.19: A cluster analysis example

In addition, the Statistics and Machine Learning Toolbox lets us visualize clusters by creating a dendrogram plot, which displays a hierarchical binary cluster tree. We can then optimize the leaf order to maximize the sum of the similarities between adjacent leaves. Finally, for grouped data with multiple measurements for each group, we can create a dendrogram plot based on the group means computed using a multivariate analysis of variance.
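
The three steps just described can be sketched as follows. The fisheriris dataset, the Euclidean distance, and the average linkage method are assumptions made for illustration:

```matlab
% Load sample data (assumed dataset for illustration)
load fisheriris            % provides meas (150-by-4) and species

% Build a hierarchical binary cluster tree from pairwise distances
D    = pdist(meas);                 % Euclidean distances by default
tree = linkage(D, 'average');       % average linkage is an assumption

% Optimize the leaf order so that adjacent leaves are as similar as possible
leafOrder = optimalleaforder(tree, D);

% Display the tree with all leaves (P = 0) in the optimized order
figure
dendrogram(tree, 0, 'Reorder', leafOrder);

% For grouped data, draw a dendrogram of the group means
% computed by a one-way multivariate analysis of variance
[~, ~, stats] = manova1(meas, species);
figure
manovacluster(stats);
```

The first figure shows every observation as a leaf, while the second shows only one leaf per group (here, one per iris species).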
