官术网_书友最值得收藏!

Binning

Sometimes, it's useful to separate feature values into several bins. For example, we may be only interested whether it rained on a particular day. Given the precipitation values, we can binarize the values, so that we get a true value if the precipitation value isn't zero and a false value otherwise. We can also use statistics to divide values into high, low, and medium bins. In marketing, we often care more about the age group, such as 18 to 24, than a specific age such as 23.

The binning process inevitably leads to loss of information. However, depending on your goals, this may not be an issue, and actually reduces the chance of overfitting. Certainly, there will be improvements in speed and reduction of memory or storage requirements and redundancy.

主站蜘蛛池模板: 井研县| 平果县| 洛川县| 鄂州市| 延川县| 大同市| 讷河市| 许昌市| 宝坻区| 安陆市| 元谋县| 德令哈市| 永顺县| 金阳县| 昌邑市| 长泰县| 富宁县| 防城港市| 格尔木市| 沁源县| 板桥市| 阿荣旗| 涟水县| 苍梧县| 十堰市| 宿迁市| 东乌| 彝良县| 宜兰县| 宿州市| 博客| 三穗县| 连城县| 集贤县| 祥云县| 南京市| 商水县| 叶城县| 鄢陵县| 通江县| 科尔|