- Learning Data Mining with Python (Second Edition)
- Robert Layton
- 2021-07-02 23:40:07
Summary
In this chapter, we used several of scikit-learn's methods to build a standard workflow for running and evaluating data mining models. We introduced the Nearest Neighbors algorithm, which is implemented in scikit-learn as an estimator. Using this class is straightforward: first, we call the fit method on our training data, and second, we call the predict method to predict the class of the testing samples.
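The fit-then-predict workflow described above can be sketched as follows. This is a minimal illustration, not the chapter's own code: the synthetic two-cluster data, the variable names, and the random seed are all assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-class data (an assumption for illustration,
# not the dataset used in the chapter).
rng = np.random.RandomState(14)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 2)),  # class 0, centred at (0, 0)
    rng.normal(5.0, 1.0, size=(50, 2)),  # class 1, centred at (5, 5)
])
y = np.array([0] * 50 + [1] * 50)

# Hold out part of the data for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=14)

# Step 1: fit the estimator on the training data.
estimator = KNeighborsClassifier()
estimator.fit(X_train, y_train)

# Step 2: predict the class of the testing samples.
y_predicted = estimator.predict(X_test)

# Fraction of testing samples classified correctly.
accuracy = np.mean(y_predicted == y_test)
print(f"Accuracy: {accuracy:.1%}")
```

The same two calls apply to other scikit-learn estimators, so swapping in a different classifier usually means changing only the line that constructs it.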
We then looked at pre-processing as a way to fix poor feature scaling. This was done using a transformer object, specifically the MinMaxScaler class. Transformers also have a fit method, followed by a transform method, which takes data of one form as input and returns a transformed dataset as output.
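As a small sketch of the fit and transform steps, the example below scales a toy array whose two columns have very different ranges. The array values are made up for illustration only.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy data (hypothetical): the second feature is far larger than the first.
X = np.array([
    [1.0, 1000.0],
    [2.0, 5000.0],
    [3.0, 3000.0],
])

scaler = MinMaxScaler()

# fit learns the minimum and maximum of each feature (column).
scaler.fit(X)

# transform rescales each feature to the range 0 to 1 using those values.
X_scaled = scaler.transform(X)
print(X_scaled)

# fit_transform performs both steps in a single call.
X_scaled_again = MinMaxScaler().fit_transform(X)
```

After scaling, both features span the same range, so neither dominates a distance calculation simply because of its units.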
To investigate these transformations further, try swapping MinMaxScaler for some of the other transformers mentioned in this chapter. Which is the most effective, and why?
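One possible way to set up such a comparison is sketched below. It rests on several assumptions: the synthetic data (one informative feature plus one large-scale noise feature), the choice of StandardScaler and Normalizer as the alternative transformers, and the use of scikit-learn's Pipeline and cross_val_score to keep the comparison fair. The scores it prints describe only this toy data and say nothing about which transformer suits a real dataset.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, Normalizer, StandardScaler

# Synthetic data (an assumption): feature 0 separates the classes,
# feature 1 is noise on a much larger scale.
rng = np.random.RandomState(14)
informative = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 100)])
noise = rng.normal(0, 1000, 200)
X = np.column_stack([informative, noise])
y = np.array([0] * 100 + [1] * 100)

# Candidate transformers to swap into the same workflow.
candidates = {
    "MinMaxScaler": MinMaxScaler(),
    "StandardScaler": StandardScaler(),
    "Normalizer": Normalizer(),
}

results = {}
for name, transformer in candidates.items():
    # The pipeline fits the transformer on each training fold only,
    # then applies it before the classifier sees the data.
    pipeline = Pipeline([
        ("scale", transformer),
        ("predict", KNeighborsClassifier()),
    ])
    scores = cross_val_score(pipeline, X, y, scoring="accuracy", cv=5)
    results[name] = scores.mean()

for name, mean_accuracy in results.items():
    print(f"{name}: {mean_accuracy:.1%}")
```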
scikit-learn also provides other transformers, such as PCA, which we will use later in this book. Try some of these out as well, referring to scikit-learn's excellent documentation at https://scikit-learn.org/stable/modules/preprocessing.html.
In the next chapter, we will use these concepts in a larger example, predicting the outcome of sports matches using real-world data.