
Basic tuning

So you've built a model. Now what? Can you call it a day? Chances are, you'll have some optimization to do. Optimizing our algorithms and methods is a key part of the machine learning process. In this section, we'll cover the basic concepts of optimization, and we'll continue learning tuning methods throughout the following chapters.

When our models do not perform well on new data, the cause is often overfitting or underfitting. Let's cover some methods that we can use to prevent this from happening. First off, let's look at the random forest classifier that we trained earlier. In your notebook, call its predict method and pass in the x_test data to receive some predictions:

predicted = rf_classifier.predict(x_test)

From this, we can evaluate the performance of our classifier with a confusion matrix, which tabulates actual labels against predicted labels so that misclassifications stand out. pandas makes this easy with the crosstab function:

pd.crosstab(y_test, predicted, rownames=['Actual'], colnames=['Predicted'])

You should see the output as follows: 

As you can see, our model performed fairly well on this dataset (it is a simple one, after all!). What happens, however, if our model doesn't perform well? Let's take a look at what could happen.
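As a small illustration of how misclassifications show up in a confusion matrix, the sketch below runs the same crosstab call on made-up labels. The `actual` and `predicted` values here are hypothetical and chosen by hand; they are not the dataset or the random forest classifier from this chapter.

```python
import pandas as pd

# Hypothetical labels, invented for illustration only
# (not the chapter's y_test or the classifier's real predictions).
actual = pd.Series(
    ["cat", "cat", "dog", "dog", "dog", "bird", "bird", "cat"], name="Actual"
)
predicted = pd.Series(
    ["cat", "dog", "dog", "dog", "dog", "bird", "cat", "cat"], name="Predicted"
)

# Rows are the true classes, columns are the predicted classes.
confusion = pd.crosstab(
    actual, predicted, rownames=["Actual"], colnames=["Predicted"]
)
print(confusion)

# Correct predictions sit on the diagonal (same row and column label);
# every off-diagonal cell counts one kind of misclassification.
correct = sum(
    confusion.loc[label, label]
    for label in confusion.index
    if label in confusion.columns
)
accuracy = correct / len(actual)  # 6 of the 8 made-up labels match
print(f"accuracy: {accuracy:.2f}")
```

In this made-up case, the off-diagonal cells show one cat predicted as a dog and one bird predicted as a cat. A model that performs poorly would put many more counts in those off-diagonal cells.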
