Hi, I am using an n-gram token filter with my index.
Index configuration:
PUT dev-dk-ngram-test
{
  "settings": {
    "index": {
      "max_ngram_diff": 2
    },
    "analysis": {
      "analyzer": {
        "default": {
          "tokenizer": "whitespace",
          "filter": [ "lowercase", "custom_grams" ]
        },
        "default_search": {
          "tokenizer": "whitespace",
          "filter": [ "lowercase", "custom_grams" ]
        }
      },
      "filter": {
        "custom_grams": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 5
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "dk-ngram-test": {
        "type": "nested",
        "include_in_parent": true,
        "properties": {
          "name": {
            "type": "text"
          },
          "desc": {
            "type": "text"
          },
          "age": { "type": "integer" }
        }
      }
    }
  }
}
Then I add a few records:
PUT dev-dk-ngram-test/_doc/8?refresh
{
  "dk-ngram-test": {
    "name": "Atlas",
    "desc": "Domain",
    "age": 1
  }
}
PUT dev-dk-ngram-test/_doc/9?refresh
{
  "dk-ngram-test": {
    "name": "Functional Classification",
    "desc": "Functional Classification",
    "age": 1
  }
}
Now I search for "Atlas":
GET dev-dk-ngram-test/_search
{
  "query": {
    "query_string": {
      "query": "atlas",
      "escape": true
    }
  }
}
I expect the document with ID 8 to have the higher score, but instead the document with ID 9 scores higher.
To give more background: we started with the default (standard) analyzer settings, but they did not return results for queries containing minor spelling mistakes, so we switched to n-grams. This works well for most of our search use cases, but searches like the one above are causing issues: a document containing the exact term sometimes gets a lower score than a document that only matches partially. Could you please help?
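To show where I think the partial match comes from, here is a rough Python sketch that mimics what my analyzer does (whitespace split, lowercase, then 3 to 5 character grams). This is not Elasticsearch code, the helper names `ngrams` and `analyze` are made up for illustration, and it says nothing about how the scores are actually computed; it only lists which grams the query shares with each "name" value.

```python
def ngrams(token, min_gram=3, max_gram=5):
    """Return every substring of token whose length is in [min_gram, max_gram]."""
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(token):
                grams.append(token[start:start + size])
    return grams


def analyze(text):
    """Hypothetical stand-in for the analyzer: whitespace split, lowercase, n-gram."""
    grams = set()
    for token in text.lower().split():
        grams.update(ngrams(token))
    return grams


query_grams = analyze("atlas")
doc8_name = analyze("Atlas")
doc9_name = analyze("Functional Classification")

# Grams produced for the search text
print(sorted(query_grams))              # ['atl', 'atla', 'atlas', 'las', 'tla', 'tlas']
# Grams the query shares with document 9's name
print(sorted(query_grams & doc9_name))  # ['las']
```

So, assuming the sketch reflects the filter correctly, document 9 matches only through the gram "las" (from "classification"), while document 8 matches every gram of the query. That is why I expected document 8 to rank first.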