
Searching your documents

Searching means narrowing a large set of documents down to the subset that satisfies a set of constraints and conditions. A search does not stop there, though. You might also want a snapshot view of the query result. In our case, a user who searches for dell might be interested in seeing the distinct product types and the document count for each. This is called an aggregation. Aggregations enhance the search experience and make the results easier to explore. In this section, we look at the querying options through which we can express these requirements and communicate them to Elasticsearch.
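As a rough sketch of what such a request could look like, the following Python snippet builds a search body that pairs a match query for dell with a terms aggregation. The helper name, the searched field `name`, the product-type field `type`, and the aggregation label `product_types` are all assumptions made for illustration; adjust them to your own mapping.

```python
import json


def search_with_type_counts(keyword, search_field="name", type_field="type"):
    """Build a search body: a match query plus a terms aggregation.

    The field names are hypothetical defaults, not taken from a real index.
    """
    return {
        "query": {"match": {search_field: keyword}},
        # A terms aggregation returns one bucket per distinct value of
        # type_field, each with the count of matching documents.
        "aggs": {"product_types": {"terms": {"field": type_field}}},
    }


body = search_with_type_counts("dell")
print(json.dumps(body, indent=2))
```

The printed JSON is the body you would send to the search endpoint; the aggregation buckets come back alongside the regular hits.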

In our search application, we expose a single search box. It hides from the user which fields are searched and what precedence each field has. Let's see the query types that best fit this search box.

A match query

A match query is the ideal place to start. It can be used on many field types, such as numbers, strings, or even dates. Let's see how we can use it to back the search box. Assume that a user searches for the keyword laptop. It makes sense to look for this keyword in the name and description fields, but not in the price or date fields.

Note

Elasticsearch, by default, indexes an additional search field called _all, which aggregates the values of all the fields in a document. Hence, for a document-level search, it's convenient to use _all.

A simple match query for the word laptop across all the fields is as follows:

{
  "query": {
    "match": {
      "_all": "laptop"
    }
  }
}

Wait, won't a search on _all also cover the date and price fields, which we don't intend? Not in this case. Remember, we set include_in_all to false for all fields other than name and description. This makes sure that the values of those other fields won't flow into _all.
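To make this concrete, here is a minimal sketch of a mapping that keeps everything except name and description out of _all. The document type `product`, the field types, and the helper name are assumptions for illustration; `include_in_all` itself is the mapping parameter referred to above.

```python
import json

# Fields whose values should flow into _all (as stated in the text).
SEARCHABLE_FIELDS = {"name", "description"}


def build_mapping(field_types, doc_type="product"):
    """Return a mapping body that excludes non-searchable fields from _all.

    doc_type and the field types passed in are hypothetical examples.
    """
    properties = {}
    for field, es_type in field_types.items():
        spec = {"type": es_type}
        if field not in SEARCHABLE_FIELDS:
            # Keep this field's values out of the aggregated _all field.
            spec["include_in_all"] = False
        properties[field] = spec
    return {"mappings": {doc_type: {"properties": properties}}}


mapping = build_mapping(
    {"name": "string", "description": "string", "price": "double", "date": "date"}
)
print(json.dumps(mapping, indent=2))
```

With a mapping along these lines, a match query on _all for laptop only sees text that came from name or description.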

Sweet, we are able to search on the string fields that make sense to us, and we get neat results. However, the requirement from management has now changed. Rather than treating the name and description fields with equal precedence, we want to give more weight to the name field. This means that a document in which the word appears in the name field should be ranked as more relevant than a document in which the only match is in the description field. Let's see how we can achieve this using a variant of the match query.

Multifield match query

A multifield match query searches multiple fields rather than a single field. It doesn't stop there. You can also assign a precedence, or boost, to each field. This lets us tell Elasticsearch to treat matches on certain fields as more important than matches on others:

{
  "query": {
    "multi_match": {
      "query": "laptop",
      "fields": [
        "name^2",
        "description"
      ]
    }
  }
}

Here, we ask Elasticsearch to match the word laptop against both the name and description fields, but to give a match on the name field greater relevance than a match on the description field. The ^2 suffix on name is the boost that expresses this.
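If the field weights come from configuration rather than being hardcoded, the query body can be assembled programmatically. The helper below is a minimal sketch; its name and the shape of the `boosts` argument are assumptions, and it only produces the same `field^boost` strings used in the query above.

```python
def multi_match_query(text, boosts):
    """Build a multi_match body from a {field: boost} dictionary.

    A boost of 1 is the default, so it is left off the field string.
    """
    fields = []
    for field, boost in boosts.items():
        fields.append(field if boost == 1 else f"{field}^{boost}")
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


query = multi_match_query("laptop", {"name": 2, "description": 1})
print(query)
```

Changing the relative importance of a field then becomes a matter of editing one number in the dictionary.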

Let's consider the following documents:

  • Document A:
    • Name: Lenovo laptop
    • Description: This is a great product with very high rating from Lenovo
  • Document B:
    • Name: Lenovo bags
    • Description: These are great laptop bags with very high rating from Lenovo

A search on the word laptop scores Document A higher than Document B, because Document A matches on the boosted name field while Document B matches only on description. This makes perfectly good sense in a real-world scenario.
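The effect of the boost on these two documents can be illustrated with a deliberately simplified model. The sketch below is not Elasticsearch's actual scoring, which also accounts for term rarity and field length; it merely counts occurrences of the term in each field, multiplies by the field's boost, and keeps the best field. The function name and the scoring rule are assumptions for illustration only.

```python
def toy_score(doc, term, boosts):
    """Toy relevance: the best value of (boost * term count) over the fields."""
    best = 0.0
    for field, boost in boosts.items():
        words = doc.get(field, "").lower().split()
        best = max(best, boost * words.count(term.lower()))
    return best


doc_a = {
    "name": "Lenovo laptop",
    "description": "This is a great product with very high rating from Lenovo",
}
doc_b = {
    "name": "Lenovo bags",
    "description": "These are great laptop bags with very high rating from Lenovo",
}

boosted = {"name": 2, "description": 1}
unboosted = {"name": 1, "description": 1}

print("A boosted:", toy_score(doc_a, "laptop", boosted))
print("B boosted:", toy_score(doc_b, "laptop", boosted))
print("A unboosted:", toy_score(doc_a, "laptop", unboosted))
print("B unboosted:", toy_score(doc_b, "laptop", unboosted))
```

In this toy model the two documents tie when both fields carry equal weight, and Document A pulls ahead once name is boosted, which mirrors the intent of name^2.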
