- Lucene 4 Cookbook
- Edwood Ng, Vineeth Mohan
- 2021-07-16 14:07:51
Defining custom analyzers
You need a custom analyzer when the built-in analyzers do not provide the behavior your search application requires. To continue with our CourtesyTitleFilter example, we will create CourtesyTitleAnalyzer.
An analyzer consists of one tokenizer followed by zero or more TokenFilters. We will build an Analyzer by extending the Analyzer abstract class and implementing the createComponents method.
How to do it…
Here is the sample code for CourtesyTitleAnalyzer:
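import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LetterTokenizer;

public class CourtesyTitleAnalyzer extends Analyzer {
    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        // Source of the chain: splits the input at non-letter characters.
        Tokenizer letterTokenizer = new LetterTokenizer(reader);
        // Wrap the tokenizer with the filter from the CourtesyTitleFilter example.
        TokenStream filter = new CourtesyTitleFilter(letterTokenizer);
        return new TokenStreamComponents(letterTokenizer, filter);
    }
}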
How it works…
An Analyzer is created by extending the Analyzer abstract class, as shown in this example. We then override the createComponents method, adding a LetterTokenizer to split text at non-letter characters and a CourtesyTitleFilter as the TokenFilter. Finally, we return a new TokenStreamComponents instance initialized with the instantiated Tokenizer and TokenFilter.
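To make the tokenizer-then-filter flow concrete, here is a minimal plain-Java sketch that mimics the two stages without using Lucene at all. Everything in it is an assumption made for illustration: the class and method names are hypothetical, and the title mapping ("Dr" to "doctor", "Mr" to "mister") is a stand-in rather than the actual behavior of CourtesyTitleFilter.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for an analysis chain; none of these names are Lucene classes.
class AnalyzerPipelineSketch {

    // Tokenizer stage: emit maximal runs of letters, splitting at every non-letter character.
    static List<String> letterTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Filter stage: rewrite tokens found in the map and pass all others through unchanged.
    // ASSUMPTION: this mapping is hypothetical and only illustrates a token-rewriting filter.
    static List<String> expandCourtesyTitles(List<String> tokens) {
        Map<String, String> titles = new HashMap<>();
        titles.put("Dr", "doctor");
        titles.put("Mr", "mister");
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            String expanded = titles.get(token);
            out.add(expanded != null ? expanded : token);
        }
        return out;
    }

    // The "analyzer": the filter consumes whatever the tokenizer produces.
    static List<String> analyze(String text) {
        return expandCourtesyTitles(letterTokenize(text));
    }

    public static void main(String[] args) {
        // prints [doctor, Who, met, mister, Spock]
        System.out.println(analyze("Dr. Who met Mr. Spock-2"));
    }
}
```

The sketch processes whole lists for brevity; a real TokenStream yields tokens one at a time, so treat this only as a model of the ordering of the two stages.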
Note that the only method we need to override is createComponents. We don't need to define a constructor to build our Analyzer, because components are not added during construction; they are added when the createComponents method is called. That is why overriding createComponents is the way to customize an Analyzer. Also note that we cannot override the tokenStream method because it is declared final.
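The relationship between a final entry point and an overridable hook can be modeled in a few lines of plain Java. The sketch below is a hypothetical miniature of that pattern, not Lucene's Analyzer: SketchAnalyzer, its simplified signatures, and the string-based "components" are all assumptions chosen so the example runs without any library.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical miniature of the pattern; this is not Lucene's Analyzer class.
abstract class SketchAnalyzer {

    // The hook: subclasses describe their chain here. It runs on each call,
    // not at construction time, so no custom constructor is needed.
    protected abstract List<String> createComponents(String fieldName);

    // The entry point is final, so subclasses cannot replace it;
    // they can only influence it through createComponents.
    public final String tokenStream(String fieldName) {
        return fieldName + ": " + String.join(" -> ", createComponents(fieldName));
    }
}

class SketchCourtesyTitleAnalyzer extends SketchAnalyzer {
    @Override
    protected List<String> createComponents(String fieldName) {
        // Component names are used as plain labels for illustration only.
        return Arrays.asList("LetterTokenizer", "CourtesyTitleFilter");
    }
}

class TemplateMethodDemo {
    public static void main(String[] args) {
        SketchAnalyzer analyzer = new SketchCourtesyTitleAnalyzer();
        // prints content: LetterTokenizer -> CourtesyTitleFilter
        System.out.println(analyzer.tokenStream("content"));
    }
}
```

Attempting to declare a tokenStream method in SketchCourtesyTitleAnalyzer would fail to compile, which mirrors the restriction described above.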