Building Your NLP Vocabulary

In the earlier chapters, you were introduced to why Natural Language Processing (NLP) is important, especially in today's context, followed by a discussion of a few prerequisites and Python libraries that are highly beneficial for NLP tasks. In this chapter, we will take this discussion further and look in detail at some of the most concrete tasks involved in building a vocabulary for NLP and preprocessing textual data. We will start by learning what a vocabulary is and then carry that notion forward to actually build one. We will do this by applying various methods to text data, methods that appear in most NLP pipelines, whatever the organization.

In this chapter, we'll cover the following topics:

  • Lexicons
  • Phonemes, graphemes, and morphemes
  • Tokenization
  • Understanding word normalization