- Machine Learning with Scala Quick Start Guide
- Md. Rezaul Karim
- 2021-06-24 14:32:01
Unsupervised learning
How would you summarize and group a dataset if no labels were given? You would probably try to find the underlying structure of the dataset and measure statistical properties such as the frequency distribution, mean, and standard deviation. And how would you effectively represent the data in a compressed format? You would probably say that you'd use some software to do the compression, although you might have no idea how that software does it. The following diagram shows the typical workflow of an unsupervised learning task:

Finding the structure of unlabeled data and representing it compactly are exactly two of the main goals of unsupervised learning, which is largely a data-driven process. We call this type of learning unsupervised because it deals with unlabeled data. The following quote comes from Yann LeCun, director of AI research (source: Predictive Learning, NIPS 2016, Yann LeCun, Facebook Research):
The most widely used unsupervised learning tasks include the following:
- Clustering: Grouping data points based on similarity (or statistical properties). For example, a company such as Airbnb often groups its apartments and houses into neighborhoods so that customers can navigate the listed ones more easily.
- Dimensionality reduction: Compressing the data while preserving its structure and statistical properties as much as possible. For example, the number of dimensions of a dataset often needs to be reduced for modeling and visualization.
- Anomaly detection: Useful in several applications, such as detecting credit card fraud, identifying faulty pieces of hardware in an industrial engineering process, and identifying outliers in large-scale datasets.
- Association rule mining: Often used in market basket analysis, for example, asking which items are frequently bought together.
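To make two of these tasks concrete, here is a minimal sketch in plain Scala (no Spark) of one-dimensional k-means clustering and z-score-based anomaly detection. It is an illustration under simple assumptions, not the book's implementation: the object name `UnsupervisedSketch`, its function names, the caller-supplied initial centroids, and the fixed iteration count are all hypothetical choices made for this example.

```scala
// Illustrative sketch only: all names here are hypothetical, not from the book.
object UnsupervisedSketch {

  // Arithmetic mean; assumes a non-empty sequence.
  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Population standard deviation; assumes a non-empty sequence.
  def stdDev(xs: Seq[Double]): Double = {
    val m = mean(xs)
    math.sqrt(xs.map(x => (x - m) * (x - m)).sum / xs.size)
  }

  // Anomaly detection by z-score: keep the points that lie more than
  // `threshold` standard deviations away from the mean.
  def outliers(xs: Seq[Double], threshold: Double): Seq[Double] = {
    val m = mean(xs)
    val s = stdDev(xs)
    if (s == 0.0) Seq.empty
    else xs.filter(x => math.abs(x - m) / s > threshold)
  }

  // Clustering with Lloyd's k-means on one-dimensional data.
  // Each iteration assigns every point to its nearest centroid and then
  // moves each centroid to the mean of the points assigned to it.
  // A centroid that attracts no points is left where it is.
  def kMeans(xs: Seq[Double], initial: Seq[Double], iterations: Int): Seq[Double] =
    (1 to iterations).foldLeft(initial) { (centroids, _) =>
      val groups = xs.groupBy(x => centroids.minBy(c => math.abs(x - c)))
      centroids.map(c => groups.get(c).map(mean).getOrElse(c))
    }
}
```

For example, `UnsupervisedSketch.kMeans(Seq(1.0, 1.2, 0.8, 9.0, 9.2, 8.8), Seq(0.0, 10.0), 5)` moves the two centroids to approximately 1.0 and 9.0, and `UnsupervisedSketch.outliers(Seq(10.0, 11.0, 9.0, 10.5, 9.5, 50.0), 2.0)` flags only the value 50.0. Real datasets are usually multi-dimensional and much larger, so in practice a library implementation would be the more likely choice.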