Stop words filter works non-deterministically

Hi colleagues,

I have the following filter and custom analyzer:

"custom_address_stopwords_filter": {
  "type": "stop",
  "stopwords": [
    "po",
    "p.o.",
    "box",
    "n\a",
    "n\a",
    "n/a",
    "n/a",
    "-",
    "-----",
    "none",
    "TBD"
  ]
},

"address_text_transliteration_analyzer": {
  "filter": [
    "icu_folding",
    "custom_address_stopwords_filter"
  ],
  "type": "custom",
  "tokenizer": "icu_tokenizer"
},

I have two cases that I expect to behave the same way, but they do not.

Case 1:
In Elasticsearch, I have indexed an entry with an "address" property that uses the analyzer above:

"address": "none"

When I query with "none" as the address, it matches, which is what I expect.

Case 2:
In Elasticsearch, I have indexed an entry with:

"address": "TBD"

When I query with "none" as the address, it does not match. Both "TBD" and "none" are stop words, so as I understand it, both should be removed before the comparison. In both cases that should leave an empty list of tokens, which I would expect to always match, but it does not.

Could you please explain what I am doing wrong?
Ewelina
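
One way to check the expectation in the two cases above is to simulate the filter chain by hand. The sketch below is only a rough stand-in, not the real ICU code: it assumes whitespace splitting is close enough to icu_tokenizer for single-word inputs, that Python's str.casefold() approximates the case-folding part of icu_folding, and that the stop filter compares tokens case-sensitively (its ignore_case option defaults to false). The analyze function and the shortened STOPWORDS list are hypothetical names used for illustration.

```python
# Rough simulation of the analyzer chain from the post:
# tokenizer -> icu_folding -> custom_address_stopwords_filter.
# Assumption: this only approximates the real Elasticsearch/ICU behaviour.

# Subset of the stop word list from the post (illustrative only).
STOPWORDS = ["po", "box", "n/a", "none", "TBD"]


def analyze(text, stopwords=STOPWORDS, ignore_case=False):
    """Return the tokens that survive folding and stop word removal."""
    # Assumption: whitespace splitting stands in for icu_tokenizer.
    tokens = text.split()
    # Assumption: casefold() stands in for icu_folding, which runs
    # BEFORE the stop filter, so the stop filter only sees folded tokens.
    tokens = [t.casefold() for t in tokens]
    if ignore_case:
        stop = {s.casefold() for s in stopwords}
    else:
        # Case-sensitive comparison: "TBD" in the list never equals "tbd".
        stop = set(stopwords)
    return [t for t in tokens if t not in stop]


print(analyze("none"))                   # []
print(analyze("TBD"))                    # ['tbd']
print(analyze("TBD", ignore_case=True))  # []
```

Under these assumptions, "none" is removed but "TBD" survives as "tbd", because the folded token no longer equals the upper-case entry in the list. If that is what happens on the cluster (the _analyze API can confirm the actual tokens), writing the stop word as "tbd" or setting "ignore_case": true on the stop filter would make the two cases consistent. Separately, an empty token list does not automatically match: for a match query this is governed by zero_terms_query, which defaults to none.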
