- R Machine Learning Projects
- Dr. Sunil Kumar Chinnamgari
- 2021-07-02 14:23:07
Dimensionality reduction
Dimensionality reduction, also referred to as feature reduction or feature selection, is the process of reducing the input set of independent variables to the smaller set of variables that the model actually needs to predict the target.
In certain cases, it is possible to represent multiple independent variables by combining them without losing much information. For example, instead of having two independent variables such as the length and the breadth of a rectangle, the dimensions can be represented by a single variable, the area, which is derived from both the length and the breadth.
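The rectangle example can be sketched in a few lines of Python. The data values and variable names here are hypothetical, chosen only to show two input features being replaced by one derived feature, and what is given up in the process.

```python
# Hypothetical toy data: (length, breadth) pairs for four rectangles.
rectangles = [(2.0, 3.0), (4.0, 5.0), (1.0, 6.0), (3.0, 3.0)]

# Replace the two input features with one derived feature, the area.
areas = [length * breadth for length, breadth in rectangles]

print(areas)  # [6.0, 20.0, 6.0, 9.0]

# Note: (2, 3) and (1, 6) now share the value 6.0, so the shape of the
# rectangle is lost while its size is kept.
```

Whether that loss is acceptable depends on what the model needs: if only the size of the rectangle matters for predicting the target, the single area variable may be enough.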
There are several reasons to perform dimensionality reduction on a given input dataset:
- It aids data compression, so the data fits in a smaller amount of disk space.
- It reduces processing time, because fewer dimensions are used to represent the data.
- It removes redundant features from datasets. Redundancy among features is commonly known as multicollinearity.
- Reducing the data to fewer dimensions makes it easier to visualize through graphs and charts.
- It removes noisy features from the dataset, which, in turn, can improve model performance.
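As an illustration of the redundancy point above, the following is a minimal sketch of a correlation-based filter, assuming that "redundant" means a Pearson correlation of at least 0.95 in absolute value with a feature that has already been kept. The function names, the threshold, and the toy data are all hypothetical, and this is only one simple way to detect redundancy, not a prescribed method.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def drop_redundant(features, threshold=0.95):
    """Keep a feature only if |correlation| with every kept feature is below threshold."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy dataset: height_in is height_cm expressed in inches.
features = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [41.0, 23.0, 35.0, 52.0, 29.0],
}

kept = drop_redundant(features)
print(kept)  # ['height_cm', 'age']
```

The filter is greedy and order-dependent: the first feature of a correlated pair is the one that survives. It also only detects pairwise linear relationships, so it would miss a feature that is a combination of several others.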
Dimensionality reduction can be achieved in several ways. One is to use filters, such as information gain filters and symmetric attribute evaluation filters. Genetic-algorithm-based selection and principal component analysis (PCA) are other popular techniques. Hybrid methods for feature selection also exist.
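To make the filter approach concrete, here is a minimal sketch of an information gain filter for categorical features, assuming the usual entropy-based definition: the gain of a feature is the entropy of the target minus the weighted entropy of the target within each feature value. The function names, the toy data, and the choice to keep only the top-ranked feature are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their weighted entropy within each feature value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy dataset with a categorical target.
target = ["yes", "yes", "no", "no", "yes", "no"]
features = {
    "outlook": ["sunny", "sunny", "rainy", "rainy", "sunny", "rainy"],
    "windy": ["t", "f", "t", "f", "t", "f"],
}

# Score every feature, rank by gain, and keep the top k (k = 1 here).
gains = {name: information_gain(values, target) for name, values in features.items()}
ranked = sorted(gains, key=gains.get, reverse=True)
selected = ranked[:1]

print(selected)  # ['outlook']
```

In this toy data, `outlook` separates the target perfectly and so receives the full gain of 1 bit, while `windy` carries almost no information about the target and is filtered out. Numeric features would need to be discretized before this sketch could be applied to them.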