
Basic tuning

So you've built a model. Now what? Can you call it a day? Chances are, you'll still have some optimization to do. A key part of the machine learning process is optimizing our algorithms and methods. In this section, we'll cover the basic concepts of optimization, and we'll continue to build on these tuning methods throughout the following chapters.

Sometimes, when a model does not perform well on new data, the cause is overfitting or underfitting. Let's cover some methods that we can use to prevent this from happening. First off, let's look at the random forest classifier that we trained earlier. In your notebook, call its predict method and pass in the x_test data to receive some predictions:

predicted = rf_classifier.predict(x_test)

From this, we can evaluate the performance of our classifier with a confusion matrix, which tabulates actual classes against predicted classes so that misclassifications stand out. Pandas makes this easy with the crosstab function:

pd.crosstab(y_test, predicted, rownames=['Actual'], colnames=['Predicted'])

You should see a table as output, with the actual classes as rows and the predicted classes as columns.
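To make the layout of such a table concrete, here is a minimal, self-contained sketch. It uses made-up labels (the names y_test_toy and predicted_toy are hypothetical) rather than the dataset and classifier from this chapter, so the numbers are purely illustrative. Correct predictions land on the diagonal, and everything off the diagonal is a misclassification.

```python
import pandas as pd

# Made-up labels for illustration only; these are NOT the chapter's data.
y_test_toy = pd.Series([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
predicted_toy = pd.Series([0, 0, 1, 1, 1, 1, 2, 2, 2, 0])

# Rows are the actual classes, columns are the predicted classes.
confusion = pd.crosstab(
    y_test_toy, predicted_toy, rownames=['Actual'], colnames=['Predicted']
)
print(confusion)

# Correct predictions sit on the diagonal (actual class == predicted class).
correct = sum(
    confusion.at[label, label]
    for label in confusion.index
    if label in confusion.columns
)
total = confusion.to_numpy().sum()
accuracy = correct / total
print(f"Accuracy: {accuracy:.2f}")
```

Dividing the diagonal sum by the total count gives the overall accuracy, which is one simple way to put a number on how well a classifier did.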

As the confusion matrix shows, our random forest classifier performed fairly well on this dataset (it is a simple one, after all!). But what happens if our model doesn't perform well? Let's take a look at what could happen.
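One rough way to tell overfitting from underfitting is to compare accuracy on the training data with accuracy on the test data: a large gap hints at overfitting, while low accuracy on both hints at underfitting. The sketch below illustrates that rule of thumb under stated assumptions. The helper names (accuracy, diagnose_fit) and the thresholds (gap_tol, min_acc) are hypothetical values chosen for illustration, not part of any library or of this chapter's code.

```python
def accuracy(actual, predicted):
    """Return the fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("cannot compute accuracy on empty inputs")
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


def diagnose_fit(train_acc, test_acc, gap_tol=0.10, min_acc=0.70):
    """Rule-of-thumb check; gap_tol and min_acc are illustrative assumptions."""
    # Poor accuracy even on the data the model was trained on:
    # the model is probably too simple for the problem.
    if train_acc < min_acc:
        return "possible underfitting"
    # Good on training data but much worse on unseen data:
    # the model may have memorized the training set.
    if train_acc - test_acc > gap_tol:
        return "possible overfitting"
    return "no obvious problem"


# Made-up accuracies for illustration only.
print(diagnose_fit(train_acc=1.00, test_acc=0.60))
print(diagnose_fit(train_acc=0.55, test_acc=0.50))
print(diagnose_fit(train_acc=0.95, test_acc=0.93))
```

With the chapter's classifier, the two accuracies could be obtained by calling predict on the training data and on x_test and passing each result to a helper like accuracy. Sensible thresholds depend on the problem, so treat the defaults here as placeholders.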
