- Python Machine Learning By Example
- Yuxi (Hayden) Liu
- 2021-07-02 22:57:19
One-hot-encoding
The one-of-K or one-hot encoding scheme uses dummy variables to encode categorical features. It was originally applied to digital circuits. The dummy variables are binary, like bits, so each one takes the value zero or one (equivalent to false or true). For instance, if we want to encode continents, we will have dummy variables such as is_asia, which is true if the continent is Asia and false otherwise. In general, we need one fewer dummy variable than there are unique labels. Because the dummy variables are mutually exclusive, the remaining label can be determined automatically: if all the dummy variables are false, the correct label is the one for which we have no dummy variable. The following table illustrates the encoding for continents:
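The scheme described above can be sketched in a few lines of plain Python. This is a minimal illustration and not the book's own code: the helper name `one_hot_encode`, its `drop_first` parameter, and the sample list of continents are assumptions made for the example. It produces one dummy variable per label except the first (in sorted order), so a row of all zeros stands for the label without a dummy variable.

```python
def one_hot_encode(labels, drop_first=True):
    """Encode labels as rows of 0/1 dummy variables (hypothetical helper).

    With drop_first=True the first category in sorted order gets no dummy
    variable of its own, giving K - 1 columns for K unique labels.
    """
    categories = sorted(set(labels))
    kept = categories[1:] if drop_first else categories
    # Column names follow the is_<label> convention used in the text.
    columns = ["is_" + c.lower().replace(" ", "_") for c in kept]
    # Each row has at most a single 1, because the dummies are exclusive.
    rows = [[1 if label == c else 0 for c in kept] for label in labels]
    return columns, rows


# Illustrative sample data, not taken from the book.
continents = ["Asia", "Europe", "Africa", "Asia"]
columns, rows = one_hot_encode(continents)

print(columns)
for label, row in zip(continents, rows):
    print(label, row)
```

Here Africa is the label without a dummy variable, so it is encoded as a row in which both is_asia and is_europe are zero.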

The encoding produces a matrix (a grid of numbers) with many zeros (false values) and only occasional ones (true values). A matrix of this kind is called a sparse matrix. The SciPy package handles sparse matrix representations well, so the large number of zeros shouldn't be an issue. We will discuss the SciPy package later in this chapter.
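As a rough sketch of how SciPy can store such a matrix, the snippet below converts a small dense one-hot matrix into compressed sparse row (CSR) form using `scipy.sparse.csr_matrix`. The matrix values are made up for illustration, and CSR is only one of several sparse formats SciPy offers; the text does not say which one is intended.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Illustrative dense one-hot matrix: one row per sample,
# one column per dummy variable (values are assumed, not from the book).
dense = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
])

# The CSR format keeps only the non-zero entries and their positions.
sparse = csr_matrix(dense)

print(sparse.shape)      # same logical shape as the dense matrix
print(sparse.nnz)        # number of stored non-zero values
print(sparse.toarray())  # convert back to a dense array when needed
```

Only the three ones are stored explicitly, which is why the memory cost grows with the number of true values and not with the full size of the grid.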