Issue with elasticsearch edge_ngram query

sai_saran · May 1, 2020, 10:54am

I am using an edge_ngram tokenizer to search in Elasticsearch. These are the index settings and mapping I used:

    {
      "settings": {
        "analysis": {
          "analyzer": {
            "my_analyzer": {
              "tokenizer": "my_tokenizer",
              "filter": [
                "lowercase"         
              ]
            }
          },
          "tokenizer": {
            "my_tokenizer": {
              "type": "edge_ngram",
              "min_gram": 2,
              "max_gram": 30,
              "token_chars": [
                "letter",
                "digit"
              ]
            }
          }
        }
      },
      "mappings": {
        "properties": {
          "text": {
            "type": "text",
            "analyzer": "my_analyzer"
          }
        }
      }
    }
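
To make the indexed terms concrete, here is a rough Python emulation of what this analyzer emits. It is only a sketch: `edge_ngrams` is a hypothetical helper, not part of Elasticsearch, and it assumes the tokenizer splits on anything that is not a letter or digit and then emits every prefix between `min_gram` and `max_gram` characters long.

```python
import re

def edge_ngrams(text, min_gram=2, max_gram=30):
    """Approximate my_tokenizer followed by the lowercase filter."""
    tokens = []
    # token_chars = letter, digit: any other character ends the current word
    for word in re.findall(r"[^\W_]+", text):
        word = word.lower()
        # Emit each prefix of the word from min_gram up to max_gram characters
        for size in range(min_gram, min(len(word), max_gram) + 1):
            tokens.append(word[:size])
    return tokens

print(edge_ngrams("Demo Coffee"))
# ['de', 'dem', 'demo', 'co', 'cof', 'coff', 'coffe', 'coffee']
```

On a real cluster, the `_analyze` API with `"analyzer": "my_analyzer"` should give the authoritative token list for the index.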

The problem is that when I search, less relevant documents get a higher score than more relevant ones.
Search query:

    {
        "query": {
            "bool": {
                "should": [
                    {
                        "match": {
                            "text": {
                                "query": "demo coffee"
                            }
                        }
                    }
                ]
            }
        }
    }
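
Because the mapping sets only `analyzer`, the `match` query analyzes "demo coffee" with `my_analyzer` as well, so it behaves like an OR over every prefix (de, dem, demo, co, ...) and any document sharing a few prefixes will match. A commonly suggested arrangement for edge_ngram fields is to keep the n-gram analyzer for indexing and use a plain analyzer at search time. The sketch below builds such a mapping as a Python dict. `search_analyzer` is a real mapping parameter, but choosing `standard` here is an assumption and has not been tested against this index.

```python
import json

# Hypothetical revision of the mapping: the index-time analyzer is unchanged,
# but query text is analyzed with the built-in "standard" analyzer instead
mappings = {
    "properties": {
        "text": {
            "type": "text",
            "analyzer": "my_analyzer",
            "search_analyzer": "standard",
        }
    }
}

# Print the fragment that would replace the "mappings" section of the index body
print(json.dumps({"mappings": mappings}, indent=2))
```

With this change the query would look for the whole terms `demo` and `coffee` among the indexed prefixes, rather than for every prefix of both words.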

Response:

    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": {
          "value": 5,
          "relation": "eq"
        },
        "max_score": 3.0498478,
        "hits": [
          {
            "_index": "test",
            "_type": "_doc",
            "_id": "6",
            "_score": 3.0498478,
            "_source": {
              "text": "Filter Coffee"
            }
          },
          {
            "_index": "test",
            "_type": "_doc",
            "_id": "2",
            "_score": 2.4077744,
            "_source": {
              "text": "Demo Tea"
            }
          },
          {
            "_index": "test",
            "_type": "_doc",
            "_id": "1",
            "_score": 2.3014567,
            "_source": {
              "text": "Demo Coffee"
            }
          },
          {
            "_index": "test",
            "_type": "_doc",
            "_id": "4",
            "_score": 1.4384104,
            "_source": {
              "text": "Coffee  Day"
            }
          },
          {
            "_index": "test",
            "_type": "_doc",
            "_id": "7",
            "_score": 0.8630463,
            "_source": {
              "text": "Demo"
            }
          }
        ]
      }
    }

Why is "Demo Coffee" ranked lower than the other results, even though it is an exact match?
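
One plausible explanation, offered as a sketch rather than a confirmed diagnosis: the response reports 5 shards, and by default each shard scores with its own term statistics. With only a handful of documents, a term's IDF can differ widely from shard to shard. The check below assumes default BM25 parameters (`k1 = 1.2`, `b = 0.75`) and that each n-gram occurs once per document. `bm25_single_term` is a hypothetical helper that computes the score of one matching term on a shard holding a single document, then divides the reported scores by it.

```python
import math

def bm25_single_term(doc_count, doc_freq, freq=1, k1=1.2, b=0.75, length_ratio=1.0):
    """Classic BM25 score of one term; length_ratio is dl / avgdl."""
    idf = math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    tf = freq * (k1 + 1) / (freq + k1 * (1 - b + b * length_ratio))
    return idf * tf

# A shard holding exactly one document: doc_count = doc_freq = 1 for every term
per_term = bm25_single_term(doc_count=1, doc_freq=1)

# _score values copied from the response above
reported = {
    "Filter Coffee": 3.0498478,
    "Demo Tea": 2.4077744,
    "Demo Coffee": 2.3014567,
    "Coffee  Day": 1.4384104,
    "Demo": 0.8630463,
}

for text, score in reported.items():
    # How many single-document-shard term matches the score corresponds to
    print(f"{text!r}: {score / per_term:.2f}")
```

Under these assumptions "Demo Coffee", "Coffee  Day" and "Demo" come out at almost exactly 8, 5 and 3 terms, which matches the number of query n-grams each one shares and is consistent with each of them sitting alone on its shard. "Filter Coffee" and "Demo Tea" do not divide evenly, which suggests their shards hold other documents and therefore use different IDF values. If that is what is happening, creating the test index with a single primary shard or searching with `search_type=dfs_query_then_fetch` should make the scores comparable.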
