官术网_书友最值得收藏!

Label encoding

Humans are able to deal with various types of values. Machine learning algorithms (with some exceptions) need numerical values. If we offer a string such as Ivan, unless we're using specialized software, the program won't know what to do. In this example, we're dealing with a categorical feature—names, probably. We can consider each unique value to be a label. (In this particular example, we also need to decide what to do with the case—is Ivan the same as ivan?). We can then replace each label with an integer—label encoding.

The following example shows how label encoding works:

This approach can be problematic, because the learner may conclude that there's an order. For example, Asia and North America in the preceding case differ by 4 after encoding, which is a bit counter-intuitive.

主站蜘蛛池模板: 乳山市| 波密县| 岳池县| 柘荣县| 临颍县| 封丘县| 获嘉县| 张家口市| 宁蒗| 个旧市| 舞钢市| 绥中县| 贡嘎县| 兴国县| 布拖县| 大安市| 韶关市| 盱眙县| 张家口市| 绥芬河市| 秭归县| 油尖旺区| 建湖县| 新沂市| 丹江口市| 遂宁市| 桃园市| 黄梅县| 年辖:市辖区| 长汀县| 莱芜市| 贵溪市| 金阳县| 陆川县| 新蔡县| 揭阳市| 威远县| 襄樊市| 黎川县| 平和县| 扬中市|