官术网_书友最值得收藏!

Label encoding

Humans are able to deal with various types of values. Machine learning algorithms (with some exceptions) need numerical values. If we offer a string such as Ivan, unless we're using specialized software, the program won't know what to do. In this example, we're dealing with a categorical feature—names, probably. We can consider each unique value to be a label. (In this particular example, we also need to decide what to do with the case—is Ivan the same as ivan?). We can then replace each label with an integer—label encoding.

The following example shows how label encoding works:

This approach can be problematic, because the learner may conclude that there's an order. For example, Asia and North America in the preceding case differ by 4 after encoding, which is a bit counter-intuitive.

主站蜘蛛池模板: 阿拉善左旗| 盐边县| 庐江县| 吉木萨尔县| 永兴县| 鄢陵县| 田阳县| 鹤岗市| 招远市| 南澳县| 丹凤县| 麦盖提县| 乌兰察布市| 宽城| 泰和县| 德保县| 靖远县| 陇川县| 囊谦县| 江华| 论坛| 香港 | 临沧市| 桓台县| 江西省| 门源| 广东省| 卓资县| 新巴尔虎左旗| 大同县| 宣威市| 丰县| 始兴县| 南木林县| 若尔盖县| 长葛市| 普兰县| 左云县| 邯郸县| 沾化县| 雅江县|