- Elasticsearch Blueprints
- Vineeth Mohan
- 2021-07-16 13:39:32
Searching your documents
A search starts with a large set of documents, of which only a subset interests us. That subset is defined by a set of constraints and conditions. A search does not stop there, though. You might also want a snapshot view of your query result. In our case, if a user searches for dell, they might want to see the unique product types among the results and the document count for each. This is called an aggregation. It enriches the search experience and makes the results easier to explore. Here, we look at the querying options through which we can express our requirement and communicate it to Elasticsearch.
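As a rough sketch of what such a request could look like, the snippet below builds a search body that pairs a query for dell with a terms aggregation. The field names `name` and `type` and the aggregation label `product_types` are assumptions made for illustration; the text does not say which field holds the product type.

```python
import json


def search_with_type_counts(text, search_field="name", type_field="type"):
    """Build a search body: a match query plus a terms aggregation.

    `search_field` and `type_field` are hypothetical field names.
    """
    return {
        "query": {"match": {search_field: text}},
        "aggs": {
            # One bucket per unique value of `type_field`, each with a document count.
            "product_types": {"terms": {"field": type_field}}
        },
    }


body = search_with_type_counts("dell")
print(json.dumps(body, indent=2))
```

The response to a request like this would carry the matching documents alongside one bucket per product type.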
In our search application, we expose a single search box. We hide from the user which fields are searched and what precedence those fields have. Let's see which query types suit this search box best.
A match query
A match query is the ideal place to start. It can be used on many field types, such as numbers, strings, or even dates. Let's see how we can use it to back the search box. Assume that a user searches for the keyword laptop. It makes sense to search the name and description fields for this keyword, but not the price or date fields.
Note
Elasticsearch, by default, stores an additional search field called _all, which aggregates all the field values in that document. Hence, for a document-level search, it's good to use _all.
A simple match query across all the fields for the word laptop is as follows:
{ "query": { "match": { "_all": "laptop" } } }
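If the query is assembled in application code, the same body could be built from the user's input. The following is a minimal sketch; the helper name `match_all_field_query` is hypothetical and not part of any Elasticsearch client.

```python
import json


def match_all_field_query(text):
    """Return a match query body that searches the _all field."""
    return {"query": {"match": {"_all": text}}}


body = match_all_field_query("laptop")
# Serialize to the JSON that would be sent as the search request body.
print(json.dumps(body))
```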
Wait, won't the _all search cover the date and price fields too? We don't intend that, and it doesn't happen in this case. Remember, we set include_in_all to false for all fields other than the name and description fields. This makes sure that the values of those fields won't flow into _all.
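For reference, a mapping with that setting might be generated along these lines. This is a sketch under assumptions: the field types (`string`, `double`, `date`) are guesses for illustration, and it presumes an Elasticsearch version that still supports _all and include_in_all.

```python
def product_mapping(field_types, searchable):
    """Build a properties mapping in which only `searchable` fields feed _all."""
    properties = {}
    for field, es_type in field_types.items():
        spec = {"type": es_type}
        if field not in searchable:
            # Keep this field's values out of the _all field.
            spec["include_in_all"] = False
        properties[field] = spec
    return {"properties": properties}


# Field types below are assumptions, not taken from the text.
mapping = product_mapping(
    {"name": "string", "description": "string", "price": "double", "date": "date"},
    searchable={"name", "description"},
)
print(mapping)
```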
Sweet, we are able to search on the string fields that make sense to us, and we get neat results. However, the requirement from management has now changed. Rather than treating the name and description fields with equal precedence, we should give more weight to the name field. This means that a document in which the word appears in the name field should be more relevant than a document in which the only match is in the description field. Let's see how we can achieve this using a variant of the match query.
Multifield match query
A multifield match query has the provision to search on multiple fields rather than a single field. Wait, it doesn't stop here. You can also give precedence or importance to each field along with it. This helps us to tell Elasticsearch to treat certain field matches better than others:
{ "query": { "multi_match": { "query": "laptop", "fields": [ "name^2", "description" ] } } }
Here, we ask Elasticsearch to match the word laptop on both the name and description fields, but to give greater relevance to a match on the name field than to a match on the description field.
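The `field^boost` notation is easy to generate from a table of weights. The helper below is a sketch; `multi_match_query` is a hypothetical name, and a boost of 1 is left implicit to match the query shown above.

```python
def multi_match_query(text, boosts):
    """Build a multi_match body from a {field: boost} mapping.

    A boost of 1 is left implicit, so {"name": 2, "description": 1}
    yields the fields list ["name^2", "description"].
    """
    fields = [
        field if boost == 1 else f"{field}^{boost}"
        for field, boost in boosts.items()
    ]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


body = multi_match_query("laptop", {"name": 2, "description": 1})
print(body)
```

Keeping the weights in one mapping makes it simple to adjust the precedence later without touching the query-building code.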
Let's consider the following documents:
- Document A:
- Name: Lenovo laptop
- Description: This is a great product with very high rating from Lenovo
- Document B:
- Name: Lenovo bags
- Description: These are great laptop bags with very high rating from Lenovo
A search on the word laptop will rank Document A above Document B, which makes perfectly good sense in a real-world scenario.
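To see why the boost produces this ordering, consider a deliberately simplified model in which a document's score is the highest boost among the fields that contain the search term. This is not Elasticsearch's actual scoring formula, which also weighs term frequency and other factors; it only illustrates the effect of the boost on the two documents above.

```python
def toy_score(doc, term, boosts):
    """Toy relevance: the largest boost among fields containing the term.

    NOT Elasticsearch's scoring formula; for illustration only.
    """
    best = 0
    for field, boost in boosts.items():
        tokens = doc.get(field, "").lower().split()
        if term.lower() in tokens:
            best = max(best, boost)
    return best


docs = {
    "A": {
        "name": "Lenovo laptop",
        "description": "This is a great product with very high rating from Lenovo",
    },
    "B": {
        "name": "Lenovo bags",
        "description": "These are great laptop bags with very high rating from Lenovo",
    },
}
boosts = {"name": 2, "description": 1}

scores = {doc_id: toy_score(doc, "laptop", boosts) for doc_id, doc in docs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```

In this toy model, Document A scores 2 through its name field and Document B scores 1 through its description field, so A ranks first.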