- Python 3 Text Processing with NLTK 3 Cookbook
- Jacob Perkins
- 2021-09-03 09:45:38
Introduction
In this chapter, we'll cover how to use corpus readers and create custom corpora. If you want to train your own model, such as a part-of-speech tagger or text classifier, you will need to create a custom corpus to train on. Model training is covered in the subsequent chapters.
Now you'll learn how to use the existing corpus data that comes with NLTK. This information is essential for future chapters when we'll need to access the corpora as training data. You've already accessed the WordNet corpus in Chapter 1, Tokenizing Text and WordNet Basics. This chapter will introduce you to many more corpora.
We'll also cover creating custom corpus readers, which you can use when your corpus is not in a file format that NLTK already recognizes, or when your corpus is not stored in files at all but in a database such as MongoDB. You should already be familiar with tokenization, which was covered in Chapter 1, Tokenizing Text and WordNet Basics.
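As a rough preview of what a corpus reader does, the sketch below points NLTK's `PlaintextCorpusReader` at a temporary directory holding two small text files. The file names, the sample sentences, and the helper functions `build_demo_corpus` and `load_demo_corpus` are illustrative assumptions, not part of the book's own examples. It relies only on the reader's default word tokenizer, so no NLTK data download should be needed.

```python
import os
import tempfile

from nltk.corpus.reader import PlaintextCorpusReader


def build_demo_corpus(root):
    """Write two small text files into root (hypothetical sample data)."""
    samples = {
        "first.txt": "NLTK ships with many corpora.",
        "second.txt": "A custom corpus is just a directory of files.",
    }
    for name, text in samples.items():
        with open(os.path.join(root, name), "w", encoding="utf-8") as handle:
            handle.write(text)


def load_demo_corpus(root):
    """Return a reader over every .txt file found under root."""
    return PlaintextCorpusReader(root, r".*\.txt")


with tempfile.TemporaryDirectory() as corpus_root:
    build_demo_corpus(corpus_root)
    reader = load_demo_corpus(corpus_root)

    # fileids() lists the files matched by the pattern.
    file_ids = reader.fileids()

    # words() tokenizes one file with the reader's default word tokenizer.
    first_words = list(reader.words("first.txt"))

    print(file_ids)
    print(first_words)
```

The same pattern, a root directory plus a file-name pattern, is a reasonable starting point for your own plain-text corpus; other formats or storage back ends call for a different or custom reader.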