Hello,
ES newbie here, looking for help understanding what's going wrong.
Consider this index definition, where I define some synonyms for motorbike models:
{
  "settings": {
    "analysis": {
      "char_filter": {
        "replace": {
          "type": "mapping",
          "mappings": [
            "&=> and "
          ]
        }
      },
      "filter": {
        "word_delimiter": {
          "type": "word_delimiter",
          "split_on_numerics": "false",
          "split_on_case_change": "true",
          "generate_word_parts": "true",
          "generate_number_parts": "true",
          "catenate_all": "true",
          "preserve_original": "true",
          "catenate_numbers": "true"
        },
        "custom_synonym": {
          "type": "synonym",
          "lenient": "true",
          "synonyms": [
            "r 1200 r , r1200 r, r 1200r, r1200r",
            "r 1150 r, r1150 r, r 1150r, r 1150 r, r1150r"
          ]
        }
      },
      "analyzer": {
        "default": {
          "type": "custom",
          "char_filter": [
            "html_strip",
            "replace"
          ],
          "tokenizer": "whitespace",
          "filter": [
            "custom_synonym",
            "lowercase",
            "word_delimiter"
          ]
        }
      }
    }
  },
  "mappings": {
    "product": {
      "properties": {
        "pname": {
          "type": "text",
          "analyzer": "default"
        }
      }
    }
  }
}
If I put two documents in the index:
PUT test_index/product/1
{
  "pname": "MOTORBIKE BMW R 1150 R"
}
PUT test_index/product/2
{
  "pname": "MOTORBIKE BMW R 1200 R"
}
and then run a match query like this:
GET test_index/_search
{
  "query": {
    "match": {
      "pname": "MOTORBIKE R1200R"
    }
  }
}
I get both hits with the same score:
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "test_index",
        "_type": "product",
        "_id": "2",
        "_score": 0.2876821,
        "_source": {
          "pname": "MOTORBIKE BMW R 1200 R"
        }
      },
      {
        "_index": "test_index",
        "_type": "product",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "pname": "MOTORBIKE BMW R 1150 R"
        }
      }
    ]
  }
}
I expected the "MOTORBIKE BMW R 1200 R" document to get a higher score, since I have defined a synonym rule for the "r1200r" term: (r 1200 r, r1200 r, r 1200r, r1200r).
Any clue?
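In case it helps with diagnosing this, I believe the tokens the "default" analyzer produces can be inspected with the _analyze API. This is only a sketch, assuming the index is named test_index as in the requests above; I have not included its output here:
GET test_index/_analyze
{
  "analyzer": "default",
  "text": "MOTORBIKE R1200R"
}
Running the same request with "MOTORBIKE BMW R 1200 R" as the text should show the tokens indexed for document 2, so the two token lists can be compared.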