Defining your features

The second step in machine learning is defining your features. Think of features as the components or attributes of the problem you wish to solve. In machine learning, and specifically when creating a new model, features have one of the biggest impacts on your model's performance. Thinking your problem statement through properly will suggest an initial set of features that differentiate the examples in your dataset and, in turn, drive your model's results. Going back to the Mayor example in the preceding section, which features would you consider as data points for each citizen? Perhaps start by looking at the Mayor's competition and at where the Mayor's positions on the issues differ from those of the other candidates. These positions could be turned into features and then made into a poll for the citizens of John Doe County to answer. Using these data points would create a solid first pass at features.

As with model building, expect to run several iterations of feature engineering and model training, especially as your dataset grows. After model evaluation, feature importance is used to determine which features are actually driving your predictions. Occasionally, you will find that a gut-instinct feature turns out to be inconsequential after a few iterations of model training and feature engineering.
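To make the feature importance idea concrete, here is a minimal Python sketch using permutation importance, which is one common way to measure it and not necessarily the approach used later in this book. Everything in it is an assumption made for illustration: the poll questions (`agrees_on_transit`, `agrees_on_taxes`, `owns_a_pet`), the synthetic responses, the rule that a citizen's vote follows the transit answer exactly, and the small nearest-centroid classifier standing in for a real model. For brevity it also scores on the training data rather than on a held-out set.

```python
import itertools
import random

# Hypothetical poll questions used as features (illustrative names only).
FEATURES = ["agrees_on_transit", "agrees_on_taxes", "owns_a_pet"]


def build_poll(repeats=25):
    """Synthetic poll: every yes/no answer combination, repeated.

    Assumption for the sketch: the vote equals the transit answer,
    so the other two features carry no signal at all.
    """
    rows = [list(combo)
            for _ in range(repeats)
            for combo in itertools.product([0, 1], repeat=len(FEATURES))]
    labels = [row[0] for row in rows]
    return rows, labels


def fit_centroids(rows, labels):
    """Train a nearest-centroid classifier: one mean vector per class."""
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    """Return the class whose centroid is closest to the row."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda cls: sq_dist(centroids[cls]))


def accuracy(centroids, rows, labels):
    hits = sum(predict(centroids, r) == y for r, y in zip(rows, labels))
    return hits / len(labels)


def permutation_importance(centroids, rows, labels, seed=0):
    """Importance of a feature = accuracy lost when its column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, rows, labels)
    importance = {}
    for j, name in enumerate(FEATURES):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        # Rebuild each row with only feature j replaced by a shuffled value.
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        importance[name] = baseline - accuracy(centroids, shuffled, labels)
    return importance


rows, labels = build_poll()
model = fit_centroids(rows, labels)
importance = permutation_importance(model, rows, labels)
for name, drop in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: accuracy drop {drop:.2f}")
```

In this toy setup, shuffling the transit answers should cost the model a large share of its accuracy, while shuffling the other two columns should cost nothing. That is the signal to look for when deciding whether a gut-instinct feature such as `owns_a_pet` is worth keeping in the next iteration.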

In Chapter 11, Training and Building Production Models, we will take a deep dive into best practices for defining features and into common approaches to complex problems that help you obtain a solid first pass at feature engineering.
