How to make a shorter (closer) token match more relevant? (edge_ngram)

Hello,

I'm getting very weird results with the edge_ngram tokenizer I'm using for autocomplete, and I'm trying to figure out how to make my results more relevant. I copied the example from the Elasticsearch documentation.

I have documents with the following descriptions:

  • "Apples, raw, without skin"
  • "Apples, raw, golden delicious, with skin"
  • "APPLEBEE'S, chili"
  • "Babyfood, fruit, applesauce, junior"

If I search for apple, "APPLEBEE'S, chili" gets a higher score than "Apples, raw, without skin".
If I search for apples, "Babyfood, fruit, applesauce, junior" gets a higher score than "Apples, raw, golden delicious, with skin".

In both cases I would like the closer/shorter match to score higher (i.e. "Apples" when I search for apple or apples).

My settings are:

"settings": {
  "analysis": {
    "analyzer": {
      "autocomplete": {
        "tokenizer": "autocomplete",
        "filter": [
          "lowercase",
          "asciifolding"
        ]
      },
      "autocomplete_search": {
        "tokenizer": "lowercase"
      }
    },
    "tokenizer": {
      "autocomplete": {
        "type": "edge_ngram",
        "min_gram": 2,
        "max_gram": 20,
        "token_chars": [
          "letter"
        ]
      }
    }
  }
},
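
To illustrate why every one of these documents matches, here is a rough Python sketch of the tokens this analyzer should produce. It only approximates the edge_ngram tokenizer with token_chars "letter" plus the lowercase filter; asciifolding is left out because the sample descriptions are plain ASCII, and edge_ngrams is a made-up helper name, not an Elasticsearch API.

```python
import re


def edge_ngrams(text, min_gram=2, max_gram=20):
    """Approximate the 'autocomplete' analyzer (hypothetical helper).

    Splits on anything that is not a letter (token_chars: ["letter"]),
    lowercases each word, and emits its prefixes of length
    min_gram..max_gram. asciifolding is NOT simulated here.
    """
    tokens = []
    # [^\W\d_]+ matches runs of letters only, so "APPLEBEE'S"
    # becomes the two words "APPLEBEE" and "S".
    for word in re.findall(r"[^\W\d_]+", text):
        word = word.lower()
        # Words shorter than min_gram produce no tokens at all.
        for n in range(min_gram, min(len(word), max_gram) + 1):
            tokens.append(word[:n])
    return tokens


descriptions = [
    "Apples, raw, without skin",
    "Apples, raw, golden delicious, with skin",
    "APPLEBEE'S, chili",
    "Babyfood, fruit, applesauce, junior",
]

for description in descriptions:
    tokens = edge_ngrams(description)
    # Show whether each search term is among the indexed tokens.
    print("apple" in tokens, "apples" in tokens, description)
```

Under this approximation the token "apple" is indexed for all four descriptions, and "apples" for every one except "APPLEBEE'S, chili", so the match alone cannot tell a whole-word hit from a prefix of a longer word.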

My query is:

"query": {
  "match": {
    "description": {
      "query": "apple",
      "operator": "and"
    }
  }
}

What do I have to do to make the more relevant results score higher?

Thanks,
Gabor
