Introduction

Indexing data is one of the most crucial parts of any Lucene or Solr deployment. When your data is not indexed properly, your search results will be poor. When the search results are poor, users will almost certainly be dissatisfied with the application that uses Solr. This is why we need our data to be prepared and indexed as promptly and correctly as possible.

On the other hand, preparing data is not an easy task. We have more and more data to deal with, and we need to index multiple formats of data from multiple sources. Do we need to parse the data manually and prepare it in XML format? The answer is no; we can let Solr do this for us. This chapter concentrates on the indexing process and data preparation. It starts with how to index binary data such as a PDF file, moves on to using the Data Import Handler to fetch data from a database and index it with Apache Solr, and then describes how to detect a document's language during indexing. We will also learn how to modify the data during indexing so that we don't have to prepare everything up front.
