
An overview of clustering

Clustering divides data into groups of similar objects. Each group (cluster) consists of objects that are similar to one another and dissimilar to objects in other groups. The goal of clustering is to determine the intrinsic grouping in a set of unlabeled data. Clustering is used in a wide range of application areas, including data mining (DNA analysis, marketing studies, insurance studies, and so on), text mining, information retrieval, statistical computational linguistics, and corpus-based computational lexicography. Some of the requirements that clustering algorithms must fulfill are as follows:

  • Scalability
  • Dealing with various types of attributes
  • Discovering clusters of arbitrary shapes
  • The ability to deal with noise and outliers
  • Interpretability and usability
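
To make the idea of grouping unlabeled data concrete, here is a minimal sketch of one possible clustering procedure, a basic k-means loop written in plain Python. The text above does not prescribe a particular algorithm; k-means is chosen here only for illustration, and the function name `kmeans`, its parameters, and the sample points are all assumptions made for this example.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Group points into k clusters with a basic k-means loop (illustrative only)."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random as initial centroids.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


# Hypothetical unlabeled data: two visibly separate groups of 2-D points.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
          (8.0, 8.1), (7.9, 8.3), (8.2, 7.9)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

With this sample data, the first three points should share one label and the last three the other, although which group receives label 0 depends on the random initialization. Note that a sketch like this addresses few of the requirements listed above: it assumes numeric attributes, favors roughly spherical clusters, and has no special handling for noise or outliers.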

The following diagram shows a representation of clustering:
