Book title: Natural Language Processing Fundamentals
Authors: Sohom Ghosh, Dwight Gunning
Chapter length: 133 words
Last updated: 2021-06-11 13:42:29

Introduction

In the previous chapter, we learned about the concepts of Natural Language Processing (NLP) and text analytics, and looked briefly at various pre-processing steps. In this chapter, we will learn how to deal with text data, most of which is unstructured. Unstructured data does not fit naturally into a tabular format, so it must be converted into numeric features, because most machine learning algorithms can work only with numbers. We will place particular emphasis on pre-processing steps such as tokenization, stemming, lemmatization, and stop-word removal. You will also learn about two popular methods for feature extraction, bag of words and Term Frequency-Inverse Document Frequency (TF-IDF), as well as various methods for creating new features from existing ones. Finally, you will become familiar with how text data can be visualized.
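To make the idea of turning text into numbers concrete, here is a minimal sketch of tokenization, stop-word removal, bag of words, and TF-IDF using only the Python standard library. It is not the book's own code: the names `STOP_WORDS`, `tokenize`, `bag_of_words`, and `tf_idf` are hypothetical, the stop-word list is a tiny hand-picked sample, and the weighting `tf * log(N / df)` is just one common TF-IDF variant among several.

```python
import math
import re
from collections import Counter

# Assumption: a tiny hand-picked stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "is", "of", "on", "the", "to"}


def tokenize(text):
    """Lowercase the text, split it into alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def bag_of_words(docs):
    """Return the sorted vocabulary and one term-count vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(vectors):
    """Weight raw counts by tf * log(N / df), one common TF-IDF variant."""
    n_docs = len(vectors)
    n_terms = len(vectors[0]) if vectors else 0
    # Document frequency: number of documents in which each term occurs.
    df = [sum(1 for vec in vectors if vec[j] > 0) for j in range(n_terms)]
    weighted = []
    for vec in vectors:
        total = sum(vec) or 1  # guard against an empty document
        weighted.append(
            [(c / total) * math.log(n_docs / df[j]) for j, c in enumerate(vec)]
        )
    return weighted


docs = [
    "The cat sat on the mat.",
    "The dog sat on the log.",
    "Cats and dogs are pets.",
]
vocab, counts = bag_of_words(docs)
weights = tf_idf(counts)

print(vocab)
print(counts[0])
```

In this toy corpus, "cat" and "cats" end up as separate vocabulary entries, which is the kind of redundancy that stemming and lemmatization are meant to reduce. The term "sat" occurs in two of the three documents, so its TF-IDF weight in the first document is lower than that of "cat", which occurs in only one.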
