  • Elasticsearch Blueprints
  • Vineeth Mohan
  • 2021-07-16 13:39:32

Choosing between a query and a filter

The basic idea of a search is to narrow down to a subset of the documents that you have. In the Elasticsearch context, this means that, based on various conditions, you might want to select a set of documents from an index or a set of indices. Queries and filters let you do this.

If you have already gone through the reference guide or some other Elasticsearch documentation, you might have noticed that much the same set of operations is available for both queries and filters. So, what differentiates a query from a filter when the operations they offer are almost the same? Let's find out.

In a query, a matched document can be a better match than another matched document. In a filter, all the matched documents are treated equally.

This means that there is a way to score, or rank, one matched document against another. This is done by computing a score that tells you how well a particular document matches the query. A better match gets a higher score and a weaker match gets a lower score. This way, we can identify the best matches and present them first when paging through the results.
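The difference between ranking and plain matching can be sketched with a toy model. This is only an illustration: the documents, the `toy_score` function, and the term-counting rule are hypothetical, and Elasticsearch's actual relevance scoring is considerably more involved.

```python
# Toy illustration only: not Elasticsearch's real scoring algorithm.
docs = {
    "doc1": "red running shoes",
    "doc2": "red shoes for running and trail running",
    "doc3": "blue sandals",
}


def toy_score(text, query_terms):
    """Hypothetical score: how many times the query terms occur in the text."""
    words = text.split()
    return sum(words.count(term) for term in query_terms)


def toy_filter(text, required_term):
    """Yes/no test: a document either matches or it does not."""
    return required_term in text.split()


query_terms = ["running", "shoes"]

# Query behaviour: matched documents are ranked, best match first.
scored = [(doc_id, toy_score(text, query_terms)) for doc_id, text in docs.items()]
ranked = sorted(
    [pair for pair in scored if pair[1] > 0],
    key=lambda pair: pair[1],
    reverse=True,
)

# Filter behaviour: matched documents form a set in which all are equal.
filtered = {doc_id for doc_id, text in docs.items() if toy_filter(text, "red")}

print(ranked)            # [('doc2', 3), ('doc1', 2)]
print(sorted(filtered))  # ['doc1', 'doc2']
```

Both `doc1` and `doc2` pass the filter on equal terms, whereas the query places `doc2` ahead of `doc1` because it matches the terms more often.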

For an e-commerce site, success depends on what percentage of incoming traffic converts into purchases. A customer searches for something they are interested in buying, and if we can't show them the most relevant results on the first page, the chance of converting that search into a purchase is slim. Few customers look at the second or subsequent pages for better options. They assume that products on later pages are less relevant than those on the current page and abandon the search there. Hence, we have to use queries to make the order of our results more relevant to the user.

But wait, what are the advantages of filters? Let's explore them.

Note

Filters don't compute a match score per document and hence, they are faster. Their results are also cached, which means that from the second search onward, the same filter responds considerably faster.

So, for structured searches, such as a date range or a number range, where scoring doesn't come into the picture, filters are the right tool. Note that filters can be used in several places:
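As a minimal sketch of a structured search, the request body below wraps a range filter in a constant_score query, so every matching document gets the same score. It is built as a plain Python dictionary; the `price` field and the `range_filter` helper are assumptions made for illustration, and the syntax follows the 1.x-era Query DSL that this text describes.

```python
import json


def range_filter(field, gte=None, lte=None):
    """Build a range filter clause (hypothetical helper, not a library call)."""
    bounds = {}
    if gte is not None:
        bounds["gte"] = gte
    if lte is not None:
        bounds["lte"] = lte
    return {"range": {field: bounds}}


# No scoring is needed here: a document is either inside the range or not.
search_body = {
    "query": {
        "constant_score": {
            "filter": range_filter("price", gte=100, lte=500)
        }
    }
}

# The JSON below is what would be sent as the body of a search request.
print(json.dumps(search_body, indent=2))
```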

  • Queries: A filter can be used for querying. Note that while the Query DSL (domain-specific language) has a separate section called query for queries, there is no separate section for filters. Rather, you need to embed your filter inside the constant_score query type or the filtered query type.
  • Scoring: Elasticsearch provides a query type called the function_score query. Using this query type, we can apply a filter and boost the score of the documents that match it.
  • A post filter: This is applied to the search results (hits), but not to the input of the aggregations. By default, the scope of an aggregation is the query; adding a post filter lets us narrow the hits without changing what the aggregations see.
  • Aggregations: We can also specify filters inside aggregations to filter documents inside a bucket.
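The scoring, post filter, and aggregation uses listed above might look roughly like the following request bodies, written as Python dictionaries. The field names (`name`, `brand`, `price`), the values, and the aggregation names are hypothetical; the `weight` parameter is assumed to be available in the Elasticsearch version in use.

```python
import json

# Hypothetical filters, used only for illustration.
brand_filter = {"term": {"brand": "acme"}}
price_filter = {"range": {"price": {"lte": 100}}}

# Scoring: documents that also match brand_filter have their score boosted.
scoring_body = {
    "query": {
        "function_score": {
            "query": {"match": {"name": "running shoes"}},
            "functions": [{"filter": brand_filter, "weight": 2}],
        }
    }
}

# Post filter and aggregations in a single request.
faceted_body = {
    "query": {"match": {"name": "running shoes"}},
    # Narrows the hits only; the aggregations below still see every
    # document matched by the query.
    "post_filter": brand_filter,
    "aggs": {
        "all_brands": {"terms": {"field": "brand"}},
        # A filter aggregation: only documents passing price_filter
        # land in this bucket and feed the nested average.
        "cheap_items": {
            "filter": price_filter,
            "aggs": {"avg_price": {"avg": {"field": "price"}}},
        },
    },
}

print(json.dumps(scoring_body, indent=2))
print(json.dumps(faceted_body, indent=2))
```

In `faceted_body`, the `all_brands` aggregation can still list every brand matched by the query even though the returned hits are restricted to one brand.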

An interesting point to note here is that filters are cached and reused independently of the context. This means that once you use a filter in a query and then reuse the same filter in an aggregation or a post filter, the same cache is hit instead of the results being computed again.

Note

Hence, make sure that you use a mixture of filters and queries, moving as many constraints into filters as the situation allows. This will avoid unwanted computation of scores.
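One way to apply this advice, sketched under the assumption of the 1.x-era filtered query mentioned earlier, is to keep only the free-text part as a scored query and push the yes/no constraints into filters. The `filtered_search` helper and the field names are hypothetical. Later Elasticsearch releases dropped the filtered query in favor of a bool query with a filter clause, so the exact syntax depends on the version in use.

```python
import json


def filtered_search(text_field, text, filters):
    """Combine one scored match query with any number of unscored filters.

    Hypothetical helper: it only builds the request body as a dictionary.
    """
    if len(filters) == 1:
        filter_part = filters[0]
    else:
        # Every filter must match; none of them contributes to the score.
        filter_part = {"bool": {"must": list(filters)}}
    return {
        "query": {
            "filtered": {
                "query": {"match": {text_field: text}},
                "filter": filter_part,
            }
        }
    }


body = filtered_search(
    "name",
    "running shoes",
    [
        {"range": {"price": {"gte": 50, "lte": 150}}},  # structured constraint
        {"term": {"in_stock": True}},                   # structured constraint
    ],
)

print(json.dumps(body, indent=2))
```

Only the match on `name` influences the ranking here; the price range and stock constraints simply include or exclude documents.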