官术网_书友最值得收藏!

Label encoding

Humans are able to deal with various types of values. Machine learning algorithms (with some exceptions) need numerical values. If we offer a string such as Ivan, unless we're using specialized software, the program won't know what to do. In this example, we're dealing with a categorical feature—names, probably. We can consider each unique value to be a label. (In this particular example, we also need to decide what to do with the case—is Ivan the same as ivan?). We can then replace each label with an integer—label encoding.

The following example shows how label encoding works:

This approach can be problematic, because the learner may conclude that there's an order. For example, Asia and North America in the preceding case differ by 4 after encoding, which is a bit counter-intuitive.

主站蜘蛛池模板: 五台县| 沂南县| 汝城县| 平南县| 安国市| 隆化县| 丰顺县| 大悟县| 阿拉善右旗| 北流市| 尉犁县| 冷水江市| 尼木县| 星子县| 营口市| 紫阳县| 光山县| 伊通| 马公市| 镇平县| 望谟县| 亚东县| 德兴市| 丹巴县| 儋州市| 松溪县| 霍城县| 云梦县| 公主岭市| 萍乡市| 鄂伦春自治旗| 五家渠市| 乐陵市| 大兴区| 林甸县| 庐江县| 阜新市| 乐昌市| 鄂伦春自治旗| 哈巴河县| 聂拉木县|