Summary

In this chapter, we extended our use of scikit-learn's classifiers and introduced the pandas library to manage our data. We analyzed real-world data on NBA basketball results, saw some of the problems that even well-curated data introduces, and created new features for our analysis.

We saw the effect that good features have on performance and used an ensemble algorithm, random forests, to further improve accuracy.
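As a compact reminder of that workflow, the following sketch compares a single decision tree with a random forest using cross-validation. It does not use the chapter's NBA dataset: the data is synthetic, and the column names (`HomeLastWin`, `VisitorLastWin`, `HomeTeamRanksHigher`, `HomeWin`) and the way the label is generated are assumptions made purely for illustration, so the printed scores say nothing about real basketball results.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a table of game results (hypothetical columns and values).
rng = np.random.RandomState(14)
n_games = 500
games = pd.DataFrame({
    "HomeLastWin": rng.randint(0, 2, n_games).astype(bool),
    "VisitorLastWin": rng.randint(0, 2, n_games).astype(bool),
    "HomeTeamRanksHigher": rng.randint(0, 2, n_games).astype(bool),
})

# Assumed labelling rule: the higher-ranked team wins, flipped for about 25% of games.
noise = rng.rand(n_games) < 0.25
games["HomeWin"] = games["HomeTeamRanksHigher"].values ^ noise

X = games[["HomeLastWin", "VisitorLastWin", "HomeTeamRanksHigher"]].values
y = games["HomeWin"].values

# Evaluate both models with the same 5-fold cross-validation.
models = [
    ("decision tree", DecisionTreeClassifier(random_state=14)),
    ("random forest", RandomForestClassifier(n_estimators=100, random_state=14)),
]
scores = {}
for name, clf in models:
    scores[name] = cross_val_score(clf, X, y, scoring="accuracy", cv=5).mean()
    print("{}: {:.1f}%".format(name, scores[name] * 100))
```

With only three binary features, the two models will likely score about the same here; the gain from an ensemble tends to show up on richer feature sets such as the ones built in this chapter.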

In the next chapter, we will extend the affinity analysis that we performed in the first chapter to create a program to find similar books. We will see how to use algorithms for ranking and also use approximation to improve the scalability of data mining.
