Help understanding fuzzy score

Given the following search:

{
  "size": 100,
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "business_names": {
              "query": "my company",
              "operator": "and",
              "fuzziness": "auto"
            }
          }
        }
      ]
    }
  }
}

I see results like MY LITTLE COMPANY with a higher score than documents that match the input exactly.

How can I formulate the query so that results that match the input exactly are at the top of the results?

One idea is to add a second query that matches exactly, with a boost, but why is that needed?

You can create a bool query with two should clauses: one with a fuzzy search and one without.

Because the second clause also matches when the texts are identical, those documents get a higher score.

An example of this here:
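A minimal sketch of such a query, reusing the field and input from the question. The boost value of 2 is an assumption for illustration and would need tuning against real data:

```json
{
  "size": 100,
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "business_names": {
              "query": "my company",
              "operator": "and",
              "fuzziness": "auto"
            }
          }
        },
        {
          "match": {
            "business_names": {
              "query": "my company",
              "operator": "and",
              "boost": 2
            }
          }
        }
      ]
    }
  }
}
```

Note that the non-fuzzy match clause still matches MY LITTLE COMPANY, since both terms are present. If "exactly" means the words adjacent and in the same order, replacing the second clause with a match_phrase query may be closer to what you want.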

Thanks, that's what I hinted at above.

I am still trying to understand why the fuzzy result is scored higher.

Is this due to the relevance of the tokens relative to the index?

Note that it could depend on the size of the field, on the total number of terms in the index, on the number of shards... So many factors.

You can try to understand using "explain": true.
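For instance, the original search with explain turned on would look roughly like this. Each hit in the response should then carry an _explanation section that breaks the score down per term, which makes it easier to compare an exact hit with MY LITTLE COMPANY:

```json
{
  "size": 100,
  "explain": true,
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "business_names": {
              "query": "my company",
              "operator": "and",
              "fuzziness": "auto"
            }
          }
        }
      ]
    }
  }
}
```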
