Book title: Hands-On Natural Language Processing with Python
Authors: Rajesh Arumugam, Rajalingappaa Shanmugamani
Updated: 2021-08-13 16:01:42

Text Classification and POS Tagging Using NLTK

The Natural Language Toolkit (NLTK) is a Python library for natural language processing (NLP) tasks, ranging from segmenting text into words or sentences to more advanced tasks such as parsing grammar and classifying text. NLTK provides modules and interfaces for working with natural language that are useful for tasks such as document topic identification, part-of-speech (POS) tagging, and sentiment analysis. For experimentation, NLTK also includes interfaces to a wide range of text corpora, from basic text collections to tagged and structured resources such as WordNet. Although NLTK provides a vast set of APIs, we will cover only the most important aspects that are commonly used in practical NLP applications.
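
To make the idea of segmenting text concrete, here is a minimal sketch of tokenizing, counting, and stemming words with NLTK. The sample sentence is made up for illustration, and the sketch assumes NLTK is already installed. It uses `wordpunct_tokenize` because that tokenizer is purely regex-based and needs no downloaded models; the chapter itself may use other tokenizers.

```python
from nltk import FreqDist
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

# Illustrative sample text (an assumption, not taken from the book).
text = "NLTK splits text into words. Splitting text into words is called tokenization."

# Regex-based tokenizer: separates words from punctuation and needs
# no downloaded models.
tokens = wordpunct_tokenize(text.lower())

# Keep alphabetic tokens only and count how often each one occurs.
words = [t for t in tokens if t.isalpha()]
freq = FreqDist(words)

# Stem a few related word forms with the Porter stemmer.
stemmer = PorterStemmer()
stems = {w: stemmer.stem(w) for w in ("splits", "splitting", "words")}

print(freq.most_common(3))
print(stems)
```

`FreqDist` behaves like a dictionary of counts, so `freq["text"]` gives the number of times "text" occurs, and the stemmer maps both "splits" and "splitting" to the same stem.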

We will cover the following topics in this chapter:

Installing NLTK and its modules
Text preprocessing and exploratory analysis
Exploratory analysis of text
POS tagging
Training a sentiment classifier for movie reviews
Training a bag-of-words classifier
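
As a small preview of the last topic, the following sketch trains a bag-of-words classifier with NLTK's `NaiveBayesClassifier`. The four training sentences, the `tokenize` and `bag_of_words` helpers, and the choice of Naive Bayes are all assumptions made for illustration; the chapter works with a real movie review corpus and may structure its features differently.

```python
from nltk import NaiveBayesClassifier
from nltk.tokenize import wordpunct_tokenize

# Tiny hand-written training set (hypothetical, for illustration only).
train_docs = [
    ("a great and moving film", "pos"),
    ("great acting, wonderful story", "pos"),
    ("an awful and boring film", "neg"),
    ("boring plot, awful acting", "neg"),
]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return [t for t in wordpunct_tokenize(text.lower()) if t.isalpha()]


# Vocabulary: every distinct word seen in the training documents.
vocabulary = sorted({w for text, _ in train_docs for w in tokenize(text)})


def bag_of_words(text):
    """Map each vocabulary word to True/False depending on its presence."""
    present = set(tokenize(text))
    return {word: (word in present) for word in vocabulary}


# NLTK classifiers train on (feature_dict, label) pairs.
train_set = [(bag_of_words(text), label) for text, label in train_docs]
classifier = NaiveBayesClassifier.train(train_set)

print(classifier.classify(bag_of_words("a great film")))
print(classifier.classify(bag_of_words("an awful, boring plot")))
```

On this toy data the two calls should print `pos` and `neg`. With so few examples the result says nothing about real accuracy; the point is only the shape of the pipeline: tokenize, build a feature dictionary per document, train, classify.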
