Hi, I have a weird issue with synonyms combined with the unique token filter that I cannot get my head around.
Minimal reproduction:
Settings:
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "synonym": {
            "type": "synonym_graph",
            "synonyms_path": "synonyms/synonyms.txt",
            "updateable": true
          }
        },
        "analyzer": {
          "synonym": {
            "tokenizer": "standard",
            "filter": [
              "synonym",
              "unique"
            ]
          }
        }
      }
    }
  }
}
synonyms.txt:
billie jo wilsson,billiejo,billie-jo,billiejo wilson,billie-jo wilson,billiejo wilsson,billie-jo wilsson,billiejoo,billie jo
Mappings:
{
  "properties": {
    "description": {
      "type": "text",
      "index": true
    }
  }
}
Documents:
{
  "description": "billie-jo wilsson"
}
Query:
{
  "query": {
    "multi_match": {
      "query": "billie-jo",
      "fields": ["description"],
      "type": "cross_fields",
      "analyzer": "synonym",
      "operator": "AND",
      "boost": 0.4
    }
  }
}
Running this query returns the indexed document, which is expected. However, after rotating the synonym list one step to the left (i.e. moving the first synonym to the end of the line):
billiejo,billie-jo,billiejo wilson,billie-jo wilson,billiejo wilsson,billie-jo wilsson,billiejoo,billie jo,billie jo wilsson
... no hits are returned. Why is that? Does the order of the synonyms matter?
However, if I remove the unique token filter from the analyzer, the query works again, even with the rotated synonyms.
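One way to see what the unique filter does to the synonym output would be the _analyze API with "explain" enabled, which lists the tokens after the tokenizer and after each filter in the chain. The body below is only a sketch: it assumes the settings above were applied to an index called my-index (a placeholder name, not from my setup) and would be sent as GET /my-index/_analyze.

```json
{
  "tokenizer": "standard",
  "filter": ["synonym", "unique"],
  "text": "billie-jo",
  "explain": true
}
```

Running it once with each ordering of synonyms.txt and comparing the token lists reported for the synonym and unique stages should show which tokens or positions differ between the two cases.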
Is this behaviour to be expected for some reason I cannot understand or is there a synonym issue here?
This is on Elasticsearch v7.14.
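To check how the analyzed tokens end up in the actual query, the validate API can print the rewritten Lucene query. A minimal sketch, again assuming a placeholder index name my-index: send the body below as GET /my-index/_validate/query?explain=true. The boost is left out here because it only affects scoring, not matching.

```json
{
  "query": {
    "multi_match": {
      "query": "billie-jo",
      "fields": ["description"],
      "type": "cross_fields",
      "analyzer": "synonym",
      "operator": "AND"
    }
  }
}
```

The "explanation" field in the response can then be compared between the original and the rotated synonym file, and with and without the unique filter in the analyzer.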