- Natural Language Processing with Java and LingPipe Cookbook
- Breck Baldwin, Krishna Dayanidhi
- 2021-08-05 17:12:51
Introduction
An important part of building NLP systems is working with the appropriate unit of processing. This chapter addresses the abstraction layer associated with the word level of processing. This layer is called tokenization: grouping adjacent characters into meaningful chunks that support classification, entity finding, and the rest of NLP.
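To make "grouping adjacent characters into meaningful chunks" concrete, here is a minimal sketch in plain Java. It does not use the LingPipe API; the class name `SimpleTokenizerSketch`, the `tokenize` method, and the regular expression are all assumptions chosen for illustration. The rule it applies is deliberately naive: a run of letters or digits forms one token, and every other non-whitespace character is a token on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is not a LingPipe class.
public class SimpleTokenizerSketch {

    // A token is either a run of letters/digits, or a single character
    // that is neither whitespace nor a letter/digit (e.g. punctuation).
    private static final Pattern TOKEN =
        Pattern.compile("[\\p{L}\\p{N}]+|[^\\s\\p{L}\\p{N}]");

    // Scans the text left to right and collects each match as a token.
    // Whitespace between tokens is skipped and not returned.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Print one token per line so punctuation tokens are easy to see.
        for (String token : tokenize("Hello, world!")) {
            System.out.println(token);
        }
    }
}
```

Even this small rule forces design decisions (should "don't" be one token or three? should case be preserved?), which is why the choice of tokenizer matters to everything built on top of it.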
LingPipe provides tokenizers for a broader range of needs than this book covers. See the Javadoc for tokenizers that do stemming, Soundex (tokens based on what English words sound like), and more.