Summary

In this chapter, we explored the fundamental issues surrounding data quality, saw how to categorize quality problems by type, and presented ideas for tidying up your data.

To compare the performance of the different models that one may create, we then established some fundamental measures of model performance, such as the mean squared error (MSE) for regression and the classification error rate for classification.
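As a reminder of what these two measures compute, here is a minimal sketch in plain Python. The function names `mean_squared_error` and `classification_error_rate` are illustrative choices made for this recap, not names taken from the chapter or from any particular library.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def classification_error_rate(actual, predicted):
    """Fraction of observations whose predicted class differs from the actual class."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a != p for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Toy regression example: squared errors are 1, 0 and 4.
    print(mean_squared_error([3.0, 5.0, 2.0], [2.0, 5.0, 4.0]))
    # Toy classification example: one of four labels is wrong.
    print(classification_error_rate(["a", "b", "a", "b"], ["a", "a", "a", "b"]))
```

For both measures, lower values indicate better performance, which is what makes them usable for ranking competing models on the same data.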

We also introduced cross-validation as a general-purpose assessment technique for cases where only a limited amount of data is available.
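The idea can be sketched in a few lines of plain Python, assuming k-fold cross-validation as the variant. This is an illustration under stated assumptions rather than the chapter's own implementation: the helper names (`k_fold_indices`, `cross_validated_mse`, `fit_mean`, `predict_mean`) are hypothetical, and the "model" is a deliberately trivial baseline that always predicts the mean of its training targets.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the row indices and deal them into k folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_mse(x, y, k, fit, predict, seed=0):
    """Average the held-out MSE over k folds; each fold is held out exactly once."""
    scores = []
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        # Fit only on the rows that are not in the held-out fold.
        model = fit([x[i] for i in train], [y[i] for i in train])
        squared_errors = [(y[i] - predict(model, x[i])) ** 2 for i in held_out]
        scores.append(sum(squared_errors) / len(squared_errors))
    return sum(scores) / len(scores)


# Hypothetical baseline model: ignore the inputs and predict the training mean.
def fit_mean(x_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, x_value):
    return model


if __name__ == "__main__":
    x = list(range(10))
    y = [2.0 * v + 1.0 for v in x]
    print(cross_validated_mse(x, y, k=5, fit=fit_mean, predict=predict_mean))
```

Because every observation is used for both training and assessment, though never in the same round, the estimate makes fuller use of a small data set than a single train/test split would.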

Finally, we discussed learning curves as a way to judge a model's ability to learn, that is, whether its scores continue to improve as it is given more training data.

With a firm grounding in the basics of the predictive modeling process, we will look at linear regression in the next chapter.
