官术网_书友最值得收藏!

Binning

Sometimes, it's useful to separate feature values into several bins. For example, we may be only interested whether it rained on a particular day. Given the precipitation values, we can binarize the values, so that we get a true value if the precipitation value isn't zero and a false value otherwise. We can also use statistics to divide values into high, low, and medium bins. In marketing, we often care more about the age group, such as 18 to 24, than a specific age such as 23.

The binning process inevitably leads to loss of information. However, depending on your goals, this may not be an issue, and actually reduces the chance of overfitting. Certainly, there will be improvements in speed and reduction of memory or storage requirements and redundancy.

主站蜘蛛池模板: 石景山区| 江门市| 禄丰县| 富顺县| 奉化市| 游戏| 凤山县| 霍山县| 离岛区| 乐安县| 定襄县| 云梦县| 巫山县| 呼玛县| 古蔺县| 容城县| 湖南省| 武胜县| 疏勒县| 当涂县| 开平市| 安阳市| 德保县| 枣庄市| 冕宁县| 乌兰县| 青河县| 庆阳市| 香港 | 平乡县| 甘孜县| 阳城县| 隆尧县| 洛浦县| 石屏县| 红河县| 金溪县| 鹤岗市| 庆元县| 五家渠市| 楚雄市|