Elasticsearch Query for good title keyword results

We have an Elasticsearch index containing a catalog of products that we want to search by title and description.

We want it to have the following constraints:

  • We search title and description for occurrences of the search term (matches in title should count twice as much as matches in description)
  • We want very light fuzzy matching, but the results should still be accurate
  • Results that do not match the search term should not be filtered out, only ranked lower (matching results at the top, worse results at the bottom)
  • category_id should act as a filter (no results from other categories should be shown)
  • The created_at attribute should also weigh heavily in the sorting.
    Products should lose score the "older" they get. (This is very important, because they lose importance with every day.)

I have tried to build such a query, but the results are far from accurate, and it sometimes returns completely unrelated products. I think that's because of the wildcard queries.

Also I think there must be a more elegant solution for the "created_at" scoring. Right?

I am using Elasticsearch 6.2.

This is my current code. I would be happy to see an elegant solution for this:

    {
      "sort": [
        {
          "_score": {
            "order": "desc"
          }
        }
      ],
      "min_score": 0.3,
      "size": 12,
      "from": 0,
      "query": {
        "bool": {
          "filter": {
            "terms": {
              "category_id": [
                "212",
                "213"
              ]
            }
          },
          "should": [
            {
              "match": {
                "title_completion": {
                  "query": "Development",
                  "boost": 20
                }
              }
            },
            {
              "wildcard": {
                "title": {
                  "value": "*Development*",
                  "boost": 1
                }
              }
            },
            {
              "wildcard": {
                "title_completion": {
                  "value": "*Development*",
                  "boost": 10
                }
              }
            },
            {
              "match": {
                "title": {
                  "query": "Development",
                  "operator": "and",
                  "fuzziness": 1
                }
              }
            },
            {
              "range": {
                "created_at": {
                  "gte": 1563264817998,
                  "boost": 11
                }
              }
            },
            {
              "range": {
                "created_at": {
                  "gte": 1563264040398,
                  "boost": 4
                }
              }
            },
            {
              "range": {
                "created_at": {
                  "gte": 1563256264398,
                  "boost": 1
                }
              }
            }
          ]
        }
      }
    }

Hi Simon,

I think the unrelated results appear because you made the range queries part of the same should clause as the text search, which effectively asks for "stuff that matches my search terms OR recent stuff".

You should put the text criteria in a must clause. Roughly speaking:

bool
    filter
        category selections
    must
        bool
            should
                match field X with search term
                match field Y with search term
    should
        recency boosts

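For the created_at scoring, one option that may be worth trying is a decay function inside a function_score query instead of stacked range boosts. The following is only a sketch under assumptions: created_at is mapped as a date field, description is the name of the description field, and the scale of 7d with a decay of 0.5 (score roughly halved for a product one week old) is an illustrative value that would need tuning against real data.

```json
{
  "size": 12,
  "from": 0,
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "filter": {
            "terms": { "category_id": ["212", "213"] }
          },
          "must": {
            "multi_match": {
              "query": "Development",
              "fields": ["title^2", "description"],
              "fuzziness": 1
            }
          }
        }
      },
      "functions": [
        {
          "gauss": {
            "created_at": {
              "origin": "now",
              "scale": "7d",
              "decay": 0.5
            }
          }
        }
      ],
      "boost_mode": "multiply"
    }
  }
}
```

Because boost_mode is multiply, the text relevance score is scaled down smoothly as a product ages, rather than jumping at fixed timestamps that have to be recomputed for every request.
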