- Learning Data Mining with Python(Second Edition)
- Robert Layton
- 2021-07-02 23:40:10
Summary
In this chapter, we extended our use of scikit-learn's classifiers and introduced the pandas library to manage our data. We analyzed real-world data on basketball results from the NBA, saw some of the problems that even well-curated data introduces, and created new features for our analysis.
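As a reminder of what creating a feature with pandas can look like, here is a minimal sketch. The table is made up, and the column names (`home_team`, `visitor_team`, `home_pts`, `visitor_pts`, `home_win`, `home_last_win`, `visitor_last_win`) are illustrative assumptions rather than the names used in the real NBA dataset.

```python
import pandas as pd

# Tiny made-up results table, in chronological order.
# Column names are illustrative assumptions, not the real dataset's.
games = pd.DataFrame({
    "home_team": ["A", "B", "A", "C", "B", "A"],
    "visitor_team": ["B", "C", "C", "A", "A", "B"],
    "home_pts": [101, 95, 88, 110, 99, 105],
    "visitor_pts": [99, 102, 90, 104, 93, 100],
})

# Class label: did the home team win this game?
games["home_win"] = games["home_pts"] > games["visitor_pts"]

# New features: did each team win its previous game?
# Teams with no earlier game in the table default to False.
last_win = {}  # team -> result of that team's most recent game
home_last_win, visitor_last_win = [], []
for row in games.itertuples(index=False):
    home_last_win.append(last_win.get(row.home_team, False))
    visitor_last_win.append(last_win.get(row.visitor_team, False))
    # Update the record only after the features for this row are stored,
    # so a game never sees its own outcome.
    last_win[row.home_team] = bool(row.home_win)
    last_win[row.visitor_team] = not bool(row.home_win)

games["home_last_win"] = home_last_win
games["visitor_last_win"] = visitor_last_win
print(games)
```

The important design point is the order inside the loop: the feature values are read before the dictionary is updated, so no information from the game being predicted leaks into its own features.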
We saw the effect that good features have on performance and used an ensemble algorithm, random forests, to further improve the accuracy. To take these concepts further, try to create your own features and test them out. Which features perform better? If you have trouble coming up with features, think about what other datasets can be included. For example, if key players are injured, this might affect the results of a specific match and cause a better team to lose.
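One way to answer the question of which features perform better is to score each candidate feature set with cross-validation, using both a single decision tree and a random forest. The sketch below uses synthetic data from scikit-learn's `make_classification` as a stand-in for the basketball features, so the numbers it prints say nothing about the NBA data; the helper name `mean_accuracy` and the feature-set labels are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a feature matrix and win/loss labels (an assumption;
# swap in your own X and y built from the real data).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=4, random_state=14
)


def mean_accuracy(clf, X, y):
    """Return the mean 5-fold cross-validated accuracy of clf on (X, y)."""
    return cross_val_score(clf, X, y, scoring="accuracy", cv=5).mean()


# Candidate feature sets to compare (hypothetical labels).
feature_sets = {
    "first 3 features": X[:, :3],
    "all 10 features": X,
}

results = {}
for feature_name, X_subset in feature_sets.items():
    # Fresh classifiers for every feature set.
    classifiers = {
        "decision tree": DecisionTreeClassifier(random_state=14),
        "random forest": RandomForestClassifier(n_estimators=100, random_state=14),
    }
    for clf_name, clf in classifiers.items():
        results[(feature_name, clf_name)] = mean_accuracy(clf, X_subset, y)

for (feature_name, clf_name), score in results.items():
    print(f"{feature_name:>18} | {clf_name:<13} | accuracy: {score:.3f}")
```

To try a new feature of your own, add another entry to `feature_sets` and compare its rows against the others.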
In the next chapter, we will extend the affinity analysis that we performed in the first chapter to create a program to find similar books. We will see how to use algorithms for ranking and also use an approximation to improve the scalability of data mining.