We have a field in our index that is analyzed with the Beider-Morse phonetic filter, and when we query this field we sometimes get very strange matches.
For example, if you search for "Heine" you will find "Chatten" as a phonetic match. I don't believe there is a serious argument to be made that those two are phonetically similar, no matter what language you consider.
The reason this is considered a match seems to lie in the phonetic synonyms that the original terms are transformed into. Both "Heine" and "Chatten" are transformed into around a dozen phonetic synonyms each, and there is only one overlap: a single synonym that is assigned to both (the synonym "xan"). One shared synonym out of roughly twelve is not a good match.
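To make the "1 out of 12" figure concrete, here is a small Python sketch of how I am counting the overlap. Only "xan" is a real token from the analyzer output; the other entries are made-up placeholders, and the `overlap` helper is just my own illustration, not anything Elasticsearch provides.

```python
def overlap(tokens_a, tokens_b):
    """Return the shared tokens and the fraction of tokens_a they cover."""
    a, b = set(tokens_a), set(tokens_b)
    shared = a & b
    ratio = len(shared) / len(a) if a else 0.0
    return shared, ratio


# Placeholder token lists: only "xan" comes from the real analyzer output,
# the remaining eleven entries per term are invented stand-ins.
heine = ["xan"] + [f"heine_{i}" for i in range(11)]
chatten = ["xan"] + [f"chatten_{i}" for i in range(11)]

shared, ratio = overlap(heine, chatten)
print(shared, round(ratio, 3))
```

A threshold on this kind of ratio is roughly what I was hoping to express with minimum-should-match.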
I don't have the expertise to determine whether the transformation into these synonyms makes sense. That's why my first instinct was to "solve" this problem by introducing a minimum-should-match clause, so that a single matching synonym would no longer be enough. I planned to play around with some values to get a feel for what would be a good compromise.
But I didn't get that far, because minimum-should-match doesn't seem to have any effect on a match query against the phonetic field.
This is what my query looks like. There are usually a lot more subqueries for other fields, which I removed for the sake of clarity. That is why there is a nested bool query that looks redundant in this simplified example:
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "company": {
              "value": "0"
            }
          }
        },
        {
          "term": {
            "accountNo": {
              "value": "80529335"
            }
          }
        }
      ],
      "should": [
        {
          "bool": {
            "should": [
              {
                "match": {
                  "address.street": {
                    "query": "Heinestr.",
                    "minimum_should_match": "3<75%"
                  }
                }
              }
            ],
            "minimum_should_match": "1"
          }
        }
      ],
      "minimum_should_match": "100%"
    }
  }
}
I tried every conceivable value in place of "minimum_should_match": "3<75%", but as far as I can tell it has no impact on the result at all.
My expectation was that with a value greater than 1, a single matching synonym would no longer be enough to produce a hit.
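For reference, this is how I read the "3<75%" syntax from the documentation, written as a small Python sketch. The `required_clauses` function is my own illustration and only covers the positive forms ("N", "P%", "N<P%"); it assumes percentages are rounded down and that the result never exceeds the number of optional clauses.

```python
def required_clauses(spec: str, optional_clauses: int) -> int:
    """Resolve a spec such as "3<75%" to the number of clauses that must match.

    Only positive integers, positive percentages and a single "N<..."
    condition are handled; negative values and multiple conditions are not.
    """
    if "<" in spec:
        threshold, rest = spec.split("<", 1)
        # At or below the threshold, every optional clause is required.
        if optional_clauses <= int(threshold):
            return optional_clauses
        spec = rest
    if spec.endswith("%"):
        # Percentages are rounded down (assumption based on my reading).
        return optional_clauses * int(spec[:-1]) // 100
    return min(int(spec), optional_clauses)


for n in (1, 3, 4, 5):
    print(n, required_clauses("3<75%", n))
```

If this reading is right, and if the phonetic synonyms of one word count as a single clause rather than a dozen (which is only a guess on my part), then a one-word query like "Heinestr." would always end up with exactly one required clause, whatever value I set.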
Any ideas how I could achieve this?
Thanks in advance!
Regards
Mario K.