Summary

In this chapter, we explored common data quality issues, saw how to categorize them by type, and presented ideas for tidying up your data.

To compare the performance of the different models that one may create, we then established some fundamental measures of model performance, such as the mean squared error (MSE) for regression and the classification error rate for classification.
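As a quick reminder of what these two measures compute, here is a minimal sketch in plain Python. The function names and the toy data are illustrative assumptions, not code from the chapter.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def classification_error_rate(y_true, y_pred):
    """Fraction of observations whose predicted class differs from the true class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy regression example: only the last prediction is off, by 2 units.
regression_mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])

# Toy classification example: one of the four labels is wrong.
error_rate = classification_error_rate(["a", "b", "a", "b"], ["a", "b", "b", "b"])
```

Lower values are better for both measures, which is what makes them usable for ranking candidate models on the same data.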

We also introduced cross-validation as a general-purpose assessment technique for cases where only a limited amount of data is available.
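The idea can be sketched as k-fold cross-validation in plain Python. This is only an illustration under stated assumptions: the helper names are hypothetical, and the "model" is a stand-in that predicts the mean of the training targets, chosen so the example stays self-contained.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k folds of near-equal size."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of observations")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Taking every k-th index yields k disjoint folds that cover all indices.
    return [indices[i::k] for i in range(k)]


def cross_validated_mse(y, k=5, seed=0):
    """Average held-out MSE of a mean-only predictor (an assumed stand-in model)."""
    fold_scores = []
    for held_out in k_fold_indices(len(y), k, seed):
        held_out_set = set(held_out)
        train = [y[i] for i in range(len(y)) if i not in held_out_set]
        prediction = sum(train) / len(train)  # "fit" on the training folds only
        mse = sum((y[i] - prediction) ** 2 for i in held_out) / len(held_out)
        fold_scores.append(mse)
    return sum(fold_scores) / k


folds = k_fold_indices(10, 5)
constant_score = cross_validated_mse([3.0] * 10, k=5)
```

Every observation is held out exactly once, so all of the limited data contributes to both fitting and assessment, just never in the same round.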

Finally, we discussed learning curves as a way to judge a model's ability to learn, that is, whether its scores are likely to keep improving.
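One common way to build a learning curve is to fit on progressively larger training subsets and record the training and validation error at each size. The sketch below assumes that setup; the function name, the mean-only predictor, and the toy data are all hypothetical choices made to keep the example short.

```python
def learning_curve_points(train_y, valid_y, sizes):
    """Return (size, training MSE, validation MSE) for each training subset size."""
    points = []
    for size in sizes:
        subset = train_y[:size]
        prediction = sum(subset) / len(subset)  # assumed stand-in model: the mean
        train_mse = sum((t - prediction) ** 2 for t in subset) / len(subset)
        valid_mse = sum((v - prediction) ** 2 for v in valid_y) / len(valid_y)
        points.append((size, train_mse, valid_mse))
    return points


# Toy data: the validation error drops as more training data is used.
curve = learning_curve_points([0.0, 2.0, 4.0, 6.0], [3.0, 3.0], sizes=[2, 4])
```

Plotting the two error columns against the size column gives the curve itself. A validation error that is still falling at the largest size suggests that more data may help.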

With a firm grounding in the basics of the predictive modeling process, we will look at linear regression in the next chapter.