- Elasticsearch Blueprints
- Vineeth Mohan
- 2021-07-16 13:39:32
Choosing between a query and a filter
The basic idea of a search is to narrow down to a subset of the documents that you have. In the Elasticsearch context, this means that based on various conditions, you might want to select a set of documents from an index or a set of indices. Queries and filters facilitate this for you.
If you have already gone through the reference guide or some other Elasticsearch documentation, you might have noticed that the same set of operations is often available for both queries and filters. So, what differentiates a query from a filter when the operations they offer are almost the same? Let's find out.
In a query, a matched document can be a better match than another matched document. In a filter, all the matched documents are treated equally.
This means that there is a way to score or rank one document matched by a query against another matched document. This is done by computing a score value that tells you how good a match a particular document is for the query. A better match gets a higher score and a weaker match gets a lower score. This way, we can identify the best matches and show them first when paging through results.
For an e-commerce site, success is decided by the percentage of incoming traffic that is converted into purchases. A customer searches for something they are interested in buying, and if we can't show them the most relevant results on the first page itself, the chance of converting the search into a purchase is slim. Most customers will not look at the second or subsequent pages for better options. They will assume that the products on later pages are less important than those on the current page and will drop the search there. Hence, we have to use queries to make the order of our results more relevant to the user.
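As a rough sketch of how scoring feeds into paging, the request body below asks for one page of the best-scoring hits of a full-text query. The field name title, the search text, and the helper name scored_page_body are assumptions made for illustration; match, from, and size are standard parts of a search request body.

```python
import json


def scored_page_body(text, page=0, page_size=10):
    """Build a search body whose hits come back ordered by relevance score.

    "title" is a hypothetical field name used only for this example.
    """
    return {
        # A match query scores every matching document; by default,
        # hits are returned with the highest score first.
        "query": {"match": {"title": text}},
        # "from" and "size" select one page of that ranked list.
        "from": page * page_size,
        "size": page_size,
    }


if __name__ == "__main__":
    # Second page (page index 1) of results for a sample search.
    print(json.dumps(scored_page_body("running shoes", page=1), indent=2))
```

Because the hits are ordered by score, page 0 holds the strongest matches and each later page holds progressively weaker ones.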
But wait, what are the advantages of filters? Let's explore them.
Note
Filters don't compute a match score per document and hence, they are faster. The results are also cached, which means that repeated searches using the same filter are served much faster.
So, for structured searches, such as a date range, number range, and so on, where scoring doesn't come into the picture, filters are the right tool. Filters can be used in several places:
- Queries: A filter can be used for querying. Note that while a query has its own section called query in the Query DSL (domain-specific language), there is no separate section for filters. Rather, you need to embed your filter inside the constant_score query type or the filtered query type.
- Scoring: Elasticsearch provides a query type called the function_score query. Using the capabilities of this query type, we can use a filter and boost the score based on the filter match.
- A post filter: This is applied to the search results or hits, but not to the input of the aggregation. This means that even though the scope of an aggregation is its query, we can narrow the returned hits without changing the aggregation input by adding a post filter.
- Aggregations: We can also specify filters inside aggregations to filter documents inside a bucket.
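The four placements listed above can be sketched as request bodies built from one filter definition. This is a minimal illustration, not a definitive recipe: the field names price and rating, the aggregation names, and the helper function names are all hypothetical, and the exact keys accepted can differ between Elasticsearch versions.

```python
import json

# Hypothetical filter: documents priced between 10 and 50.
# The field name "price" is an assumption for this example.
PRICE_FILTER = {"range": {"price": {"gte": 10, "lte": 50}}}


def as_constant_score(filter_clause):
    """Queries: wrap the filter in a constant_score query."""
    return {"query": {"constant_score": {"filter": filter_clause}}}


def as_score_boost(query, filter_clause, weight):
    """Scoring: apply a weight to documents that match the filter."""
    return {
        "query": {
            "function_score": {
                "query": query,
                "functions": [{"filter": filter_clause, "weight": weight}],
            }
        }
    }


def as_post_filter(query, filter_clause, aggs):
    """Post filter: narrows the hits; the aggregations still see
    every document matched by the query."""
    return {"query": query, "aggs": aggs, "post_filter": filter_clause}


def as_filter_aggregation(query, filter_clause):
    """Aggregations: a filter bucket keeps only the matching documents."""
    return {
        "query": query,
        "aggs": {
            "in_price_range": {
                "filter": filter_clause,
                # "rating" is a hypothetical numeric field.
                "aggs": {"avg_rating": {"avg": {"field": "rating"}}},
            }
        },
    }


if __name__ == "__main__":
    text_query = {"match": {"title": "running shoes"}}
    brand_counts = {"brands": {"terms": {"field": "brand"}}}
    for body in (
        as_constant_score(PRICE_FILTER),
        as_score_boost(text_query, PRICE_FILTER, weight=2),
        as_post_filter(text_query, PRICE_FILTER, brand_counts),
        as_filter_aggregation(text_query, PRICE_FILTER),
    ):
        print(json.dumps(body, indent=2))
```

Each helper only assembles a dictionary, so the bodies can be inspected or serialized before being sent with whichever client you use.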
A very interesting point to note here is that filters are cached and used independently of context. This means that once you use a filter in a query and then reuse the same filter in an aggregation or post filter, the same cache is hit instead of the results being computed again.
Note
Hence, make sure that you always use a mixture of filters and queries, moving as many constraints as the situation allows into filters. This will avoid unwanted computation of scores.
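One way to picture this mixture is a search where only the free-text part is scored and the structured constraints sit in filters. The sketch below first uses the filtered query type mentioned earlier; because that query type is not available in later Elasticsearch releases, a second helper shows what should be an equivalent body using a bool query with a filter clause. The field names description, brand, and price and both helper names are assumptions for illustration.

```python
import json


def filtered_search(text, brand, max_price):
    """Score on the text match only; brand and price are yes/no filters.

    Uses the filtered query type. "description", "brand" and "price"
    are hypothetical field names.
    """
    return {
        "query": {
            "filtered": {
                # Scored part: how well does the description match?
                "query": {"match": {"description": text}},
                # Unscored part: a document either passes or it doesn't.
                "filter": {
                    "bool": {
                        "must": [
                            {"term": {"brand": brand}},
                            {"range": {"price": {"lte": max_price}}},
                        ]
                    }
                },
            }
        }
    }


def bool_search(text, brand, max_price):
    """The same split expressed with a bool query's filter clause."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"description": text}}],
                "filter": [
                    {"term": {"brand": brand}},
                    {"range": {"price": {"lte": max_price}}},
                ],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(filtered_search("waterproof jacket", "acme", 100), indent=2))
    print(json.dumps(bool_search("waterproof jacket", "acme", 100), indent=2))
```

In both forms, the brand and price constraints only decide whether a document is included, so no score has to be computed for them.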