- Python Machine Learning By Example
- Yuxi (Hayden) Liu
- 2021-07-02 22:57:19
One-hot-encoding
The one-of-K or one-hot-encoding scheme uses dummy variables to encode categorical features. It was originally applied to digital circuits. The dummy variables are binary, like bits, so they take the value zero or one (equivalent to false or true). For instance, to encode continents we would have dummy variables such as is_asia, which is true if the continent is Asia and false otherwise. In general, we need as many dummy variables as there are unique labels, minus one. Because the dummy variables are mutually exclusive, the remaining label can be inferred from them: if all the dummy variables are false, the correct label is the one for which we have no dummy variable. The following table illustrates the encoding for continents:
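In code, the scheme described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the book's own implementation; the sample data, the function name `dummy_encode`, and the choice of the alphabetically first label as the one without a dummy variable are all assumptions made for the example.

```python
# Hypothetical sample data; the continent names are chosen for illustration.
continents = ["Asia", "Europe", "Africa", "Asia", "Oceania"]


def dummy_encode(values):
    """Encode labels with one dummy variable per unique label except one.

    Assumption: the alphabetically first label is the reference label.
    It gets no dummy variable and is represented by a row of zeros.
    """
    labels = sorted(set(values))
    reference, encoded_labels = labels[0], labels[1:]
    columns = ["is_" + label.lower() for label in encoded_labels]
    rows = [
        [1 if value == label else 0 for label in encoded_labels]
        for value in values
    ]
    return reference, columns, rows


reference, columns, rows = dummy_encode(continents)
print("label without a dummy variable:", reference)
print(columns)
for value, row in zip(continents, rows):
    print(value, row)
```

With four unique labels, the sketch produces three dummy variables, and the row for the reference label contains only zeros.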

The encoding produces a matrix (a grid of numbers) with many zeros (false values) and only occasional ones (true values). This type of matrix is called a sparse matrix. The SciPy package provides dedicated sparse matrix data structures that store only the nonzero entries, so the sparsity shouldn't be an issue. We will discuss the SciPy package later in this chapter.
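As a rough illustration of how such a matrix can be held in SciPy, the sketch below converts a small encoded matrix into the compressed sparse row (CSR) format from `scipy.sparse`. The matrix values are hypothetical (five samples, three dummy variables), and CSR is only one of several sparse formats SciPy offers; it is picked here as an assumption, not because the text prescribes it.

```python
import numpy as np
from scipy import sparse

# Hypothetical encoded data: 5 samples, 3 dummy variables.
# The all-zero row stands for the label that has no dummy variable.
dense = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
    [0, 0, 1],
])

# The CSR format stores only the nonzero entries and their positions.
encoded = sparse.csr_matrix(dense)

print(encoded.shape)      # dimensions of the full matrix
print(encoded.nnz)        # number of entries actually stored
print(encoded.toarray())  # convert back to a dense array when needed
```

For a matrix this small the savings are negligible, but with thousands of unique labels almost every entry is zero and the sparse representation becomes far more compact than the dense one.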