- Scala for Machine Learning(Second Edition)
- Patrick R. Nicolas
Chapter 5. Dimension Reduction
As described in the Assessing a model/overfitting section of Chapter 2, Data Pipelines, indiscriminate reliance on a large number of features may cause overfitting; the model becomes so tightly coupled to the training set that different validation sets generate vastly different outcomes and quality metrics such as AuROC.
Dimension reduction techniques alleviate these problems by detecting features that have little influence on the overall model behavior.
This chapter introduces three categories of dimension reduction techniques with two implementations in Scala:
- Divergence with an implementation of the Kullback-Leibler distance
- Principal components analysis
- Estimation of low dimension feature space for nonlinear models
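To make the first technique concrete, here is a minimal sketch of the Kullback-Leibler divergence between two discrete probability distributions; the object and method names are hypothetical illustrations, not the book's implementation:

```scala
// Minimal sketch of the Kullback-Leibler divergence D(p || q) for two
// discrete distributions p and q over the same support.
// Hypothetical helper, not the book's implementation.
object KullbackLeibler {
  def divergence(p: Seq[Double], q: Seq[Double]): Double = {
    require(p.size == q.size, "Distributions must share the same support")
    // Terms with zero probability contribute 0 by convention (0 * log 0 = 0)
    p.zip(q).collect {
      case (pi, qi) if pi > 0.0 && qi > 0.0 => pi * math.log(pi / qi)
    }.sum
  }
}

object KullbackLeiblerDemo extends App {
  val p = Seq(0.5, 0.5)
  val q = Seq(0.9, 0.1)
  // Divergence is 0 for identical distributions, positive otherwise
  println(KullbackLeibler.divergence(p, p))
  println(KullbackLeibler.divergence(p, q))
}
```

Note that the divergence is not symmetric, which is why the book refers to it loosely as a "distance"; in feature selection it is typically used to rank features by how much their distribution diverges between classes.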
Other types of methodologies used to reduce the number of features such as regularization or singular value decomposition are discussed in future chapters.
But first, let's start our investigation by defining the problem.