- Natural Language Processing with Java and LingPipe Cookbook
- Breck Baldwin Krishna Dayanidhi
- 2021-08-05 17:12:51
Introduction
An important part of building NLP systems is working with the appropriate unit of processing. This chapter addresses the abstraction layer associated with the word level of processing. Producing these word-level units is called tokenization, which amounts to grouping adjacent characters into meaningful chunks in support of classification, entity finding, and the rest of NLP.
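As a rough illustration of what "grouping adjacent characters into meaningful chunks" means, here is a minimal sketch in plain Java. It does not use LingPipe; the class name `TokenizeSketch`, the method `tokenize`, and the regular expression are all assumptions made for this example. It treats a run of letters, a run of digits, or a single non-whitespace symbol as one token, and discards whitespace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: not LingPipe's tokenizer API.
final class TokenizeSketch {

    // A token is a run of letters, a run of digits,
    // or any single non-whitespace character (e.g. punctuation).
    private static final Pattern TOKEN =
            Pattern.compile("\\p{L}+|\\p{N}+|\\S");

    // Groups adjacent characters of the input into tokens, skipping whitespace.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints one token per line.
        for (String token : tokenize("Tokenizers split text into 3 chunks.")) {
            System.out.println(token);
        }
    }
}
```

Under these rules the sample sentence yields seven tokens: six words or numbers plus the final period. A different choice of rule (for example, keeping "3" attached to a following unit, or splitting on whitespace only) would yield different tokens, which is why the choice of tokenizer matters for downstream tasks.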
LingPipe provides tokenizers for a broad range of needs, not all of which are covered in this book. Look at the Javadoc for tokenizers that do stemming, Soundex (tokens based on what English words sound like), and more.
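To give a feel for what a sound-based token looks like, the following is a minimal plain-Java sketch of the classic American Soundex code. It is an assumption-laden illustration, not LingPipe's implementation, whose details may differ; the class name `SoundexSketch` and method `soundex` are hypothetical. The code keeps the first letter, maps later consonants to digit classes, collapses adjacent letters of the same class (also across `h` and `w`), drops vowels, and pads or truncates to four characters.

```java
// Hypothetical illustration only: not LingPipe's Soundex tokenizer.
final class SoundexSketch {

    // Digit class for each letter a..z; '0' marks letters that are not coded
    // (vowels, h, w, y).
    private static final String CODES = "01230120022455012623010202";

    static String soundex(String word) {
        StringBuilder out = new StringBuilder();
        char prev = '0';
        for (char ch : word.toLowerCase().toCharArray()) {
            if (ch < 'a' || ch > 'z') {
                continue; // ignore anything that is not an ASCII letter
            }
            char code = CODES.charAt(ch - 'a');
            if (out.length() == 0) {
                // The first letter is kept as-is; its class still suppresses
                // an immediately following letter of the same class.
                out.append(Character.toUpperCase(ch));
                prev = code;
                continue;
            }
            if (ch == 'h' || ch == 'w') {
                continue; // h and w do not separate letters of the same class
            }
            if (code != '0' && code != prev) {
                out.append(code);
                if (out.length() == 4) {
                    break;
                }
            }
            prev = code; // a vowel resets prev, so a repeated class is coded again
        }
        while (out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "Robert" and "Rupert" sound alike and receive the same code.
        System.out.println(soundex("Robert"));
        System.out.println(soundex("Rupert"));
    }
}
```

Mapping tokens to such codes lets similar-sounding spellings be matched as the same token, at the cost of also conflating some words that are unrelated.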