- Natural Language Processing Fundamentals
- Sohom Ghosh Dwight Gunning
- 138 words
- 2021-06-11 13:42:31
Summary
In this chapter, you learned about the various types of data and how to work with unstructured text data. Text data is usually untidy and needs to be cleaned and pre-processed. Pre-processing mainly consists of tokenization, stemming, lemmatization, and stop-word removal. After pre-processing, features are extracted from the text using methods such as bag-of-words (BoW) and TF-IDF, which convert unstructured text into structured numeric data. Feature engineering then creates new features from the existing ones. In the last part of the chapter, we explored ways of visualizing text data, such as word clouds.
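As a compact reminder of how these steps fit together, here is a minimal sketch that tokenizes a few sentences, removes stop words, and builds BoW and TF-IDF vectors. It uses only the Python standard library, so it is not the chapter's own code. The stop-word list, the sample sentences, the function names, and the weighting `tf * log(N / df)` are all assumptions made for illustration; library implementations typically use smoothed or normalized variants. Stemming and lemmatization are left out to keep the sketch short.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "are", "and", "of", "to", "in", "on"}


def preprocess(text):
    """Lowercase the text, tokenize on runs of letters, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(docs):
    """Return (vocabulary, count vectors), with one vector per document."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors) using tf * log(N / df)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for vec in counts if vec[i] > 0) for i in range(len(vocab))]
    weighted = []
    for vec in counts:
        total = sum(vec) or 1  # guard against documents with no tokens left
        weighted.append(
            [(c / total) * math.log(n_docs / df[i]) for i, c in enumerate(vec)]
        )
    return vocab, weighted


# Hypothetical sample documents, not taken from the chapter.
docs = [
    "The cat sat on the mat.",
    "The dog chased the cat.",
    "Dogs and cats are pets.",
]
vocab, bow = bag_of_words(docs)
_, weights = tf_idf(docs)
```

Each row of `bow` and `weights` is a numeric vector aligned with `vocab`, which is the structured form a model can consume. Because no stemming is applied, `cat` and `cats` stay separate features here; a stemmer or lemmatizer would normally merge them.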
In the next chapter, you will learn how to develop machine learning models that classify texts using the features you learned to extract in this chapter. It also introduces different sampling techniques and model evaluation metrics.