Brand: 中圖公司
Listed: 2021-08-05 16:44:37
Publisher: Packt Publishing
The digital rights to this book are provided by 中圖公司, which has authorized 上海閱文信息技術有限公司 to produce and distribute it.
- Cover
- Copyright page
- Credits
- About the Authors
- About the Reviewers
- www.PacktPub.com
- Support files, eBooks, discount offers, and more
- Preface
- What this book covers
- What you need for this book
- Who this book is for
- Conventions
- Reader feedback
- Customer support
- Chapter 1. Simple Classifiers
- Introduction
- Deserializing and running a classifier
- Getting confidence estimates from a classifier
- Getting data from the Twitter API
- Applying a classifier to a .csv file
- Evaluation of classifiers – the confusion matrix
- Training your own language model classifier
- How to train and evaluate with cross validation
- Viewing error categories – false positives
- Understanding precision and recall
- How to serialize a LingPipe object – classifier example
- Eliminate near duplicates with the Jaccard distance
- How to classify sentiment – simple version
- Chapter 2. Finding and Working with Words
- Introduction
- Introduction to tokenizer factories – finding words in a character stream
- Combining tokenizers – lowercase tokenizer
- Combining tokenizers – stop word tokenizers
- Using Lucene/Solr tokenizers
- Using Lucene/Solr tokenizers with LingPipe
- Evaluating tokenizers with unit tests
- Modifying tokenizer factories
- Finding words for languages without white spaces
- Chapter 3. Advanced Classifiers
- Introduction
- A simple classifier
- Language model classifier with tokens
- Naïve Bayes
- Feature extractors
- Logistic regression
- Multithreaded cross validation
- Tuning parameters in logistic regression
- Customizing feature extraction
- Combining feature extractors
- Classifier-building life cycle
- Linguistic tuning
- Thresholding classifiers
- Train a little, learn a little – active learning
- Annotation
- Chapter 4. Tagging Words and Tokens
- Introduction
- Interesting phrase detection
- Foreground- or background-driven interesting phrase detection
- Hidden Markov Models (HMM) – part-of-speech
- N-best word tagging
- Confidence-based tagging
- Training word tagging
- Word-tagging evaluation
- Conditional random fields (CRF) for word/token tagging
- Modifying CRFs
- Chapter 5. Finding Spans in Text – Chunking
- Introduction
- Sentence detection
- Evaluation of sentence detection
- Tuning sentence detection
- Marking embedded chunks in a string – sentence chunk example
- Paragraph detection
- Simple noun phrases and verb phrases
- Regular expression-based chunking for NER
- Dictionary-based chunking for NER
- Translating between word tagging and chunks – BIO codec
- HMM-based NER
- Mixing the NER sources
- CRFs for chunking
- NER using CRFs with better features
- Chapter 6. String Comparison and Clustering
- Introduction
- Distance and proximity – simple edit distance
- Weighted edit distance
- The Jaccard distance
- The Tf-Idf distance
- Using edit distance and language models for spelling correction
- The case restoring corrector
- Automatic phrase completion
- Single-link and complete-link clustering using edit distance
- Latent Dirichlet allocation (LDA) for multitopic clustering
- Chapter 7. Finding Coreference Between Concepts/People
- Introduction
- Named entity coreference with a document
- Adding pronouns to coreference
- Cross-document coreference
- The John Smith problem
- Index (updated: 2021-08-05 17:13:04)