Basic tuning

So you've built a model. Now what? Can you call it a day? Chances are, you'll still have some optimization to do. A key part of the machine learning process is optimizing our algorithms and methods. In this section, we'll cover the basic concepts of optimization, and we'll continue to learn tuning methods throughout the following chapters.

When our models do not perform well on new data, the cause is often overfitting or underfitting. Let's cover some methods that we can use to prevent this from happening. First off, let's look at the random forest classifier that we trained earlier. In your notebook, call its predict method and pass in the x_test data to receive some predictions:

predicted = rf_classifier.predict(x_test)

From this, we can evaluate the performance of our classifier with a confusion matrix, which tabulates the actual classes against the predicted classes so that misclassifications stand out. pandas makes this easy with the crosstab function:

pd.crosstab(y_test, predicted, rownames=['Actual'], colnames=['Predicted'])

You should see the output as follows: 

As you can see, our model performed fairly well on this dataset (it is a simple one, after all!). What happens, however, if our model doesn't perform well? Let's take a look at what could happen.
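Before moving on, here is a minimal, self-contained sketch of the steps in this section. It assumes scikit-learn's bundled iris dataset as a stand-in for the data used earlier, along with an arbitrary 70/30 split and a fixed random seed; none of these choices come from the text itself. Only the names rf_classifier, x_test, y_test, and predicted are taken from the text. As a rough first check for overfitting, it also compares accuracy on the training data with accuracy on the held-out test data.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Assumption: iris is only a stand-in dataset for illustration.
features, labels = load_iris(return_X_y=True)

# Assumption: a 70/30 stratified split with a fixed seed for repeatability.
x_train, x_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.3, random_state=42, stratify=labels
)

# Train a random forest, then predict on the held-out data.
rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
rf_classifier.fit(x_train, y_train)
predicted = rf_classifier.predict(x_test)

# Confusion matrix: rows are actual classes, columns are predicted classes.
confusion = pd.crosstab(
    y_test, predicted, rownames=['Actual'], colnames=['Predicted']
)
print(confusion)

# A large gap between these two scores is one common sign of overfitting.
train_accuracy = rf_classifier.score(x_train, y_train)
test_accuracy = rf_classifier.score(x_test, y_test)
print(f"train accuracy: {train_accuracy:.3f}")
print(f"test accuracy:  {test_accuracy:.3f}")
```

Entries on the diagonal of the table are correct predictions, and everything off the diagonal is a misclassification. If both scores are low, the model is more likely underfitting than overfitting.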