- Learning Data Mining with Python (Second Edition)
- Robert Layton
- 2021-07-02 23:40:09
Using decision trees
We can import the DecisionTreeClassifier class and create a Decision Tree using scikit-learn:
from sklearn.tree import DecisionTreeClassifier
clf = DecisionTreeClassifier(random_state=14)
We used 14 for our random_state again and will do so for most of the book. Using the same random seed makes experiments reproducible. In your own experiments, however, you should vary the random state to check that the algorithm's performance is not tied to one specific value.
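One way to check that a result does not hinge on a single seed is to repeat the evaluation with several values of random_state. The following is a minimal sketch under stated assumptions: the data comes from scikit-learn's make_classification as a stand-in (it is not the dataset used in this chapter), and the names X_demo, y_demo, clf_demo, and seed_scores are made up for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; not the chapter's dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=5,
                                     random_state=0)

# Evaluate the same classifier with several different seeds
seed_scores = {}
for seed in [1, 14, 42, 100]:
    clf_demo = DecisionTreeClassifier(random_state=seed)
    scores = cross_val_score(clf_demo, X_demo, y_demo, scoring='accuracy')
    seed_scores[seed] = np.mean(scores)

for seed, score in seed_scores.items():
    print("random_state={0}: {1:.1f}%".format(seed, score * 100))
```

If the accuracies are close to each other, the result is probably not an artifact of one lucky seed.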
We now need to extract the dataset from our pandas data frame in order to use it with our scikit-learn classifier. We do this by selecting the columns we wish to use and reading the values attribute of the resulting view of the data frame, which returns the data as a NumPy array. The following code creates a dataset using our last win values for both the home team and the visitor team:
X_previouswins = dataset[["HomeLastWin", "VisitorLastWin"]].values
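To see what this selection produces, here is a small sketch on a hand-made data frame. The frame toy and its rows are invented for illustration, and the HomeWin column is a hypothetical extra column included only to show that unselected columns are left out.

```python
import pandas as pd

# Hand-made stand-in for the real data frame (values are invented)
toy = pd.DataFrame({
    "HomeLastWin": [True, False, True],
    "VisitorLastWin": [False, False, True],
    "HomeWin": [True, False, False],  # hypothetical column, not selected
})

# Select two columns, then take the underlying NumPy array
X_toy = toy[["HomeLastWin", "VisitorLastWin"]].values

print(type(X_toy))
print(X_toy.shape)  # (3, 2): three rows, two feature columns
```

The result is a two-dimensional array with one row per game and one column per selected feature, which is the shape scikit-learn estimators expect.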
Decision trees are estimators, as introduced in Chapter 2, Classifying using scikit-learn Estimators, and therefore have fit and predict methods. We can also use the cross_val_score function to get the average score (as we did previously):
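# cross_val_score lives in sklearn.model_selection
# (the old sklearn.cross_validation module has been removed)
from sklearn.model_selection import cross_val_score
import numpy as np
scores = cross_val_score(clf, X_previouswins, y_true,
                         scoring='accuracy')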
print("Accuracy: {0:.1f}%".format(np.mean(scores) * 100))
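Because a decision tree is an estimator, it can also be trained and queried directly through fit and predict. The sketch below uses a tiny invented dataset in which the label simply equals the first feature; the names X_small, y_small, tree, and predictions are made up for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Tiny invented dataset: the label is equal to the first feature
X_small = np.array([[1, 0], [0, 1], [1, 1], [0, 0], [1, 0], [0, 1]])
y_small = np.array([1, 0, 1, 0, 1, 0])

tree = DecisionTreeClassifier(random_state=14)
tree.fit(X_small, y_small)  # learn the splitting rules from the data

# Predict the class of two samples
predictions = tree.predict(np.array([[1, 1], [0, 0]]))
print(predictions)
```

cross_val_score performs these same fit and predict steps internally, once per fold, on different train and test splits.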
This scores 59.4 percent: we are better than choosing randomly! However, we aren't beating our other baseline of just choosing the home team. In fact, the two results are almost identical. We should be able to do better. Feature engineering is one of the most difficult tasks in data mining, and choosing good features is key to getting good outcomes, more so than choosing the right algorithm!