- Mastering Java for Data Science
- Alexey Grigorev
Dimensionality reduction
Another group of unsupervised learning algorithms is dimensionality reduction. These algorithms compress the dataset while keeping only the most useful information. When a dataset carries too much information, a machine learning algorithm may struggle to use all of it at once, or it may simply take too long to process everything; compressing the data first makes processing faster.
There are multiple algorithms that can reduce the dimensionality of the data, including Principal Component Analysis (PCA), Locally Linear Embedding (LLE), and t-SNE. All of these are examples of unsupervised dimensionality reduction techniques.
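As a rough illustration of the idea, the sketch below reduces a numeric dataset to its top k principal components using Apache Commons Math. This is a minimal, assumed implementation of PCA (mean-center, eigendecompose the covariance matrix, project onto the leading eigenvectors); the class name PcaSketch is hypothetical, and the library choice is only one of several ways to do this in Java.

```java
import org.apache.commons.math3.linear.Array2DRowRealMatrix;
import org.apache.commons.math3.linear.EigenDecomposition;
import org.apache.commons.math3.linear.RealMatrix;
import org.apache.commons.math3.stat.correlation.Covariance;

public class PcaSketch {

    // Projects the rows of `data` onto the top `k` principal components.
    public static double[][] project(double[][] data, int k) {
        int rows = data.length;
        int cols = data[0].length;

        // center each column at zero mean
        double[] means = new double[cols];
        for (double[] row : data) {
            for (int j = 0; j < cols; j++) {
                means[j] += row[j] / rows;
            }
        }
        double[][] centered = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                centered[i][j] = data[i][j] - means[j];
            }
        }

        // covariance matrix of the centered data and its eigendecomposition
        RealMatrix cov = new Covariance(centered).getCovarianceMatrix();
        EigenDecomposition eig = new EigenDecomposition(cov);

        // take the first k eigenvectors as the projection basis
        // (Commons Math orders eigenvalues in decreasing order)
        double[][] basis = new double[cols][k];
        for (int p = 0; p < k; p++) {
            double[] v = eig.getEigenvector(p).toArray();
            for (int j = 0; j < cols; j++) {
                basis[j][p] = v[j];
            }
        }

        // multiply the centered data by the basis to get the compressed dataset
        RealMatrix projection = new Array2DRowRealMatrix(centered)
                .multiply(new Array2DRowRealMatrix(basis));
        return projection.getData();
    }
}
```

Calling PcaSketch.project(data, 2), for example, would compress a wide feature matrix down to two columns that preserve most of its variance.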
Not all dimensionality reduction algorithms are unsupervised; some of them can use labels to reduce the dimensionality more effectively. For example, many feature selection algorithms rely on labels to determine which features are useful and which are not.
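One simple, assumed example of such a label-aware approach is a correlation filter: score each feature by its absolute Pearson correlation with the label and keep only the highest-scoring ones. The sketch below uses Apache Commons Math for the correlation; the class name CorrelationFeatureRanking is illustrative, and this is only one basic filter method, not the only way to do supervised feature selection.

```java
import org.apache.commons.math3.stat.correlation.PearsonsCorrelation;

public class CorrelationFeatureRanking {

    // Scores each feature column by |Pearson correlation| with the labels;
    // higher scores suggest the feature is more useful for prediction.
    public static double[] score(double[][] features, double[] labels) {
        int rows = features.length;
        int cols = features[0].length;
        PearsonsCorrelation corr = new PearsonsCorrelation();

        double[] scores = new double[cols];
        for (int j = 0; j < cols; j++) {
            // extract column j as a separate array
            double[] column = new double[rows];
            for (int i = 0; i < rows; i++) {
                column[i] = features[i][j];
            }
            scores[j] = Math.abs(corr.correlation(column, labels));
        }
        return scores;
    }
}
```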
We will talk more about this in Chapter 5, Unsupervised Learning - Clustering and Dimensionality Reduction.