Summary

In this chapter, we extended our use of scikit-learn's classifiers and introduced the pandas library to manage our data. We analyzed real-world data on NBA basketball results, saw some of the problems that even well-curated data can introduce, and created new features for our analysis.

We saw the effect that good features have on performance and used an ensemble algorithm, random forests, to further improve accuracy.
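As a brief recap, the following minimal sketch compares a decision tree with a random forest, before and after adding an engineered feature. It is not the chapter's exact code: the data is synthetic (not real NBA results), and the column names `HomeLastWin`, `VisitorLastWin`, `HomeStrength`, `VisitorStrength`, and `HomeTeamRanksHigher`, as well as the helper `mean_accuracy`, are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for game results (an assumption, NOT real NBA data).
rng = np.random.RandomState(14)
n_games = 400
games = pd.DataFrame({
    "HomeLastWin": rng.randint(0, 2, n_games),
    "VisitorLastWin": rng.randint(0, 2, n_games),
    "HomeStrength": rng.rand(n_games),
    "VisitorStrength": rng.rand(n_games),
})

# Hypothetical outcome rule: the stronger team usually wins, with some noise.
noise = rng.normal(0, 0.2, n_games)
games["HomeWin"] = (
    games["HomeStrength"] - games["VisitorStrength"] + noise > 0
).astype(int)

# Engineered feature: does the home team rank higher than the visitor?
games["HomeTeamRanksHigher"] = (
    games["HomeStrength"] > games["VisitorStrength"]
).astype(int)

y = games["HomeWin"].values
X_basic = games[["HomeLastWin", "VisitorLastWin"]].values
X_extended = games[
    ["HomeLastWin", "VisitorLastWin", "HomeTeamRanksHigher"]
].values


def mean_accuracy(clf, X):
    """Mean 5-fold cross-validated accuracy of clf on (X, y)."""
    return cross_val_score(clf, X, y, scoring="accuracy", cv=5).mean()


tree_basic = mean_accuracy(DecisionTreeClassifier(random_state=14), X_basic)
tree_extended = mean_accuracy(
    DecisionTreeClassifier(random_state=14), X_extended
)
forest_extended = mean_accuracy(
    RandomForestClassifier(n_estimators=100, random_state=14), X_extended
)

print("Decision tree, basic features:    {:.3f}".format(tree_basic))
print("Decision tree, extended features: {:.3f}".format(tree_extended))
print("Random forest, extended features: {:.3f}".format(forest_extended))
```

On this synthetic data, the engineered feature carries most of the signal, so the gap between the basic and extended feature sets is much larger than the gap between the two classifiers. On real data, the relative gains will differ.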

In the next chapter, we will extend the affinity analysis that we performed in the first chapter to build a program that finds similar books. We will see how to use algorithms for ranking and how approximation can improve the scalability of data mining.
