Choosing between a query and a filter

The basic idea of a search is to narrow your documents down to a subset. In the Elasticsearch context, this means selecting a set of documents from an index, or from a set of indices, based on various conditions. Queries and filters both let you do this.

If you have already gone through the reference guide or other Elasticsearch documentation, you might have noticed that almost the same set of operations is available for both queries and filters. So what differentiates a query from a filter when the operations they offer are almost the same? Let's find out.

In a query, a matched document can be a better match than another matched document. In a filter, all the matched documents are treated equally.

This means that there is a way to score, or rank, one document matched by a query against another matched document. This is done by computing a score value that tells you how good a match a particular document is for the query. A document that is a better match gets a higher score, and a weaker match gets a lower score. This way, we can identify the best matches and return them first when paging through the results.
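The idea of ranking by score and then paging can be sketched in plain Python. This is only an illustration: the hit list, the score values, and the helper name page_of_hits are made up for the example, and Elasticsearch does this ordering itself on the server.

```python
# Hypothetical hits; the "_score" values are made up for illustration.
hits = [
    {"_id": "a", "_score": 0.4},
    {"_id": "b", "_score": 2.1},
    {"_id": "c", "_score": 1.3},
    {"_id": "d", "_score": 0.9},
]


def page_of_hits(hits, page, size):
    """Return one page of hits, best-scoring documents first (page is 0-based)."""
    ranked = sorted(hits, key=lambda hit: hit["_score"], reverse=True)
    start = page * size
    return ranked[start:start + size]


first_page = page_of_hits(hits, page=0, size=2)
second_page = page_of_hits(hits, page=1, size=2)
print([hit["_id"] for hit in first_page])   # ['b', 'c']
print([hit["_id"] for hit in second_page])  # ['d', 'a']
```

The best matches land on the first page, which is the page that matters most to the user.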

For an e-commerce site, success depends on what percentage of incoming traffic is converted into purchases. A customer searches for something they are interested in buying, and if we can't show them the most relevant results on the first page itself, the chance of converting the search into a purchase is slim. Very few customers look at the second or subsequent pages for better options. They assume that the products on later pages are less relevant than those on the current page and drop the search there. Hence, we have to use queries to make our result order more relevant to the user.

But wait, what are the advantages of filters? Let's explore them.

Note

Filters don't compute a match score per document and hence, they are faster. Their results are also cached, which means that from the second search onward, the same filter runs much faster.

So, for structured searches, such as a date range or a number range, where scoring doesn't come into the picture, filters are the right tool. Filters can be used in several places:

  • Queries: A filter can be used for querying. Note that while a query has its own section, called query, in the Query DSL (domain-specific language), there is no separate section for filters. Rather, you need to embed your filter inside the constant_score query type or the filtered query type.
  • Scoring: Elasticsearch provides a query type called the function_score query. Using this query type, we can attach a filter and boost the score of the documents that match it.
  • A post filter: This is applied to the search results (hits), but not to the input of the aggregations. By default, the scope of an aggregation is its query; adding a post filter lets us narrow the hits further without changing the documents that the aggregations see.
  • Aggregations: We can also specify filters inside aggregations to filter documents inside a bucket.
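The four places listed above can be sketched as request bodies built from Python dictionaries. This is a minimal sketch, not a definitive reference: the field names (price, title, in_stock, brand), the values, and the aggregation names are assumptions made for the example, and the exact keys (for instance filtered and the weight used in function_score) depend on the Elasticsearch release, so check them against the reference for your version.

```python
import json

# Hypothetical structured constraint; the field name "price" is an assumption.
price_filter = {"range": {"price": {"gte": 100, "lte": 500}}}

# Hypothetical full-text query; the field name "title" is an assumption.
text_query = {"match": {"title": "laptop"}}

# 1. Queries: embed the filter in a filtered query (scored by text_query)...
filtered_body = {
    "query": {"filtered": {"query": text_query, "filter": price_filter}}
}

# ...or in a constant_score query, where every match gets the same score.
constant_score_body = {
    "query": {"constant_score": {"filter": price_filter}}
}

# 2. Scoring: function_score boosts documents that match a filter.
function_score_body = {
    "query": {
        "function_score": {
            "query": text_query,
            "functions": [
                {"filter": {"term": {"in_stock": True}}, "weight": 2}
            ],
        }
    }
}

# 3. Post filter: narrows the hits; the aggregation still sees every
#    document matched by the query.
post_filter_body = {
    "query": text_query,
    "post_filter": price_filter,
    "aggs": {"brands": {"terms": {"field": "brand"}}},
}

# 4. Aggregations: a filter aggregation restricts the documents in a bucket.
filter_agg_body = {
    "query": text_query,
    "aggs": {
        "mid_priced": {
            "filter": price_filter,
            "aggs": {"avg_price": {"avg": {"field": "price"}}},
        }
    },
}

# Serialize one body to the JSON that would be sent to the search endpoint.
print(json.dumps(filtered_body, indent=2))
```

Only the full-text match contributes to scoring in these bodies; the price range is a yes/no constraint and therefore sits in a filter each time.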

A very interesting point to note here is that filters are cached and used independently of the context. This means that once you use a filter in a query and then reuse the same filter in an aggregation or post filter, the same cache is hit instead of the results being recomputed.
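As a rough illustration of reusing one filter across contexts, the sketch below defines a filter once and places it in two different request bodies. The field names and values are assumptions made for the example, and whether the cache is actually shared is decided by Elasticsearch; the code only shows that the filter definition sent in each context is identical.

```python
import json

# Hypothetical filter, defined once; the field name "category" is an assumption.
category_filter = {"term": {"category": "electronics"}}

# First request: the filter constrains the query itself.
query_context_body = {
    "query": {
        "filtered": {
            "query": {"match": {"title": "phone"}},
            "filter": category_filter,
        }
    }
}

# Second request: the same filter is applied as a post filter.
post_filter_context_body = {
    "query": {"match": {"title": "phone"}},
    "post_filter": category_filter,
}


def canonical(filter_definition):
    """Serialize a filter with sorted keys so two definitions can be compared."""
    return json.dumps(filter_definition, sort_keys=True)


in_query = query_context_body["query"]["filtered"]["filter"]
in_post_filter = post_filter_context_body["post_filter"]

# The definition is the same in both contexts.
print(canonical(in_query) == canonical(in_post_filter))
```

Keeping the filter in one variable, rather than retyping it per request, makes it easy to send exactly the same definition everywhere.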

Note

Hence, make sure that you always use a mixture of filters and queries, moving as many constraints into filters as the situation allows. This will avoid unwanted computation of scores.
