Book title: Python 3 Text Processing with NLTK 3 Cookbook
Author: Jacob Perkins
Chapter word count: 219 words
Last updated: 2021-09-03 09:45:34

Introduction

The Natural Language Toolkit (NLTK) is a comprehensive Python library for natural language processing and text analytics. Originally designed for teaching, it has been adopted in industry for research and development because of its usefulness and breadth of coverage. NLTK is often used for rapid prototyping of text processing programs and can even be used in production applications. Demos of select NLTK functionality and production-ready APIs are available at http://text-processing.com.

This chapter will cover the basics of tokenizing text and using WordNet. Tokenization is a method of breaking up a piece of text into many pieces, such as sentences and words, and is an essential first step for recipes in the later chapters. WordNet is a dictionary designed for programmatic access by natural language processing systems. It has many different use cases, including:

Looking up the definition of a word
Finding synonyms and antonyms
Exploring word relations and similarity
Word sense disambiguation for words that have multiple uses and definitions
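
The first three use cases above can be sketched with NLTK's WordNet corpus reader (nltk.corpus.wordnet). This is a minimal illustration rather than a recipe from the book: the helper name explore_word and the shape of its return value are assumptions made for this example, and word sense disambiguation is not shown. The function returns None when NLTK or the WordNet data is not installed.

```python
def explore_word(word, other):
    """Collect a few WordNet facts about `word` (hypothetical helper).

    Returns None if NLTK is not installed or the WordNet corpus
    has not been downloaded.
    """
    try:
        from nltk.corpus import wordnet
        # The corpus is loaded lazily, so a missing download
        # surfaces here as a LookupError.
        synsets = wordnet.synsets(word)
        other_synsets = wordnet.synsets(other)
    except (ImportError, LookupError):
        return None

    if not synsets:
        return {"definition": None, "synonyms": [],
                "antonyms": [], "similarity": None}

    first = synsets[0]
    lemmas = [lemma for syn in synsets for lemma in syn.lemmas()]

    # Lemma names across all synsets; this set includes the word itself.
    synonyms = sorted({lemma.name() for lemma in lemmas})
    # Antonyms are defined on lemmas, not on synsets.
    antonyms = sorted({ant.name() for lemma in lemmas
                       for ant in lemma.antonyms()})
    # Wu-Palmer similarity between the first sense of each word;
    # it may be None when no common ancestor is found.
    similarity = (first.wup_similarity(other_synsets[0])
                  if other_synsets else None)

    return {"definition": first.definition(), "synonyms": synonyms,
            "antonyms": antonyms, "similarity": similarity}


result = explore_word("good", "bad")
print(result if result is not None else "WordNet is not available")
```

If the function returns None, the usual cause is that the WordNet data has not been fetched yet; running nltk.download('wordnet') once should make the lookups work.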

NLTK includes a WordNet corpus reader, which we will use to access and explore WordNet. A corpus is just a body of text, and corpus readers are designed to make accessing a corpus much easier than direct file access. We'll be using WordNet again in the later chapters, so it's important to familiarize yourself with the basics first.
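
As a preview of the tokenization topic that opens this chapter, the idea of breaking text into sentences and then into words can be sketched with Python's standard re module. This is a deliberately naive stand-in and not NLTK's tokenizer: the function names split_sentences and split_words and the sample text are assumptions made for illustration, and the sentence rule breaks on abbreviations such as "Dr.".

```python
import re


def split_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace (naive rule)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_words(sentence):
    """Return runs of word characters and single punctuation marks as tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


# Illustrative input text
text = "Hello World. It's good to see you. Thanks for buying this book."
sentences = split_sentences(text)
words = [split_words(s) for s in sentences]

print(sentences)
print(words[0])
```

A regular expression like this treats every period as a sentence boundary, which is one reason a dedicated tokenizer is generally preferable for real text.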
