Edge-ngram and irrelevant scoring

slim-souissi · February 28, 2020, 9:51am

Hello,
I'm using edge_ngram as the index tokenizer. My problem is that when a term matches, it matches many times on the same field. As a consequence, the score becomes too high and irrelevant.
Here is an example: when I search for the word "Hystorique", I match the value "Consultation_Hystorique_Clients_Recherches", which is correct. But here are the highlights:

"highlight": {
  "data.content": [
    "Consultation_<em>Hys</em><em>t</em><em>o</em><em>r</em><em>i</em><em>q</em><em>u</em><em>e</em>_Clients_Recherches"
  ]
}

Here is my analyzer:

"analysis": {
  "analyzer": {
    "autocomplete": {
      "tokenizer": "autocomplete",
      "filter": ["lowercase"]
    }
  },
  "tokenizer": {
    "autocomplete": {
      "type": "edge_ngram",
      "min_gram": 3,
      "max_gram": 50,
      "token_chars": [
        "letter", "digit", "letter_number", "uppercase_letter", "line_separator"
      ],
      "custom_token_chars": ["_", "-"]
    }
  }
}

Thanks in advance.
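
For anyone reproducing this, the repeated highlights can be inspected with the `_analyze` API. The request below is only a sketch: it assumes the analysis settings above were applied to an index called `my-index` (a hypothetical name, not from the post). Since no separate search analyzer is configured, the query text goes through the same `autocomplete` analyzer, and with `min_gram` set to 3 it should come back as one token per prefix, from `hys` up to `hystorique`. Each of those tokens can match the field and be highlighted on its own, which would explain the chain of `<em>` tags.

```
# "my-index" is a hypothetical index name; "autocomplete" is the analyzer from the post
GET my-index/_analyze
{
  "analyzer": "autocomplete",
  "text": "Hystorique"
}
```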

dadoonet · February 28, 2020, 12:38pm

I think you should define a simple search_analyzer for your field in the mapping.
This might help.
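
A minimal sketch of that suggestion, not taken from the gist: the index name `my-index` is hypothetical, `token_chars` is reduced to the documented `letter` and `digit` classes for brevity, and the built-in `simple` analyzer is an assumption for what "simple" means here. The idea is to keep edge n-grams at index time only, so the search term stays a single token and matches the field once instead of once per prefix.

```
# Hypothetical index; edge n-grams are produced at index time only
PUT my-index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "tokenizer": "autocomplete",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "autocomplete": {
          "type": "edge_ngram",
          "min_gram": 3,
          "max_gram": 50,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "data": {
        "properties": {
          "content": {
            "type": "text",
            "analyzer": "autocomplete",
            "search_analyzer": "simple"
          }
        }
      }
    }
  }
}

# The query text is analyzed with "simple", so it is not split into prefixes
GET my-index/_search
{
  "query": { "match": { "data.content": "Hystorique" } },
  "highlight": { "fields": { "data.content": {} } }
}
```

Note that the `simple` analyzer splits on any non-letter character and drops digits, so a custom search analyzer (for example a `standard` tokenizer with a `lowercase` filter) may fit better if the search terms contain numbers.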

I shared an example at the very end of this gist:

https://gist.github.com/dadoonet/f911291c4dd19b0802031db3064c648f

bbl.json

GET /

#### 1ST PART CRUD
DELETE villes

# Create the first doc
PUT villes/_doc/cergy
{
  "message": "Ici c'est Cergy"
}

This file has been truncated.

system · March 27, 2020, 12:38pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
