
Introduction

Indexing data is one of the most crucial parts of any Lucene or Solr deployment. When your data is not indexed properly, your search results will be poor, and when the search results are poor, users will almost certainly be dissatisfied with the application that uses Solr. This is why we need our data to be prepared and indexed as promptly and correctly as possible.

On the other hand, preparing data is not an easy task. Nowadays, we have more and more data floating around, and we need to index multiple formats of data from multiple sources. Do we need to parse the data manually and prepare it in XML format? The answer is no; we can let Solr do this for us. This chapter concentrates on the indexing process and data preparation: it starts with how to index binary PDF files, moves on to using the Data Import Handler to fetch data from a database and index it with Apache Solr, and then describes how to detect the document language during indexing. We will also learn how to modify data during indexing so that we don't have to prepare everything up front.
