I'm looking for a complete example of how to perform fuzzy phrase matching effectively in Elasticsearch, so that a user gets useful results as they type.
So far, the closest I've got is splitting the search query string into words and creating one clause per word in a span_near query. It doesn't work all that well, unfortunately, specifically when someone has typed only one or two characters in the second term. I've pasted the query below.
{
  "query": {
    "span_near": {
      "clauses": [
        {
          "span_multi": {
            "match": {
              "fuzzy": {
                "name": {"fuzziness": 2, "value": "word1"}
              }
            }
          }
        },
        {
          "span_multi": {
            "match": {
              "fuzzy": {
                "name": {"fuzziness": 2, "value": "partial2"}
              }
            }
          }
        }
      ],
      "slop": 3,
      "in_order": true
    }
  }
}
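The clause-per-word construction described above can be sketched in Python as follows. This is a minimal sketch under assumptions, not the exact code in use: the helper name `build_span_near` is hypothetical, and splitting on whitespace is an assumed tokenization. The field name, fuzziness, and slop default to the values in the query above.

```python
import json


def build_span_near(text, field="name", fuzziness=2, slop=3):
    """Build a span_near query body with one fuzzy span_multi clause per typed word.

    Hypothetical helper: words are obtained by splitting on whitespace (an assumption).
    """
    words = text.split()
    if not words:
        raise ValueError("search text must contain at least one word")

    # One span_multi/fuzzy clause per word, in the order the user typed them.
    clauses = [
        {
            "span_multi": {
                "match": {
                    "fuzzy": {field: {"fuzziness": fuzziness, "value": word}}
                }
            }
        }
        for word in words
    ]

    return {
        "query": {
            "span_near": {"clauses": clauses, "slop": slop, "in_order": True}
        }
    }


if __name__ == "__main__":
    # Print the request body that would be sent for the partial input "Kuwait C".
    print(json.dumps(build_span_near("Kuwait C"), indent=2))
```

Calling `build_span_near("Kuwait C")` yields two clauses, the second a fuzzy clause for the single character "C", which is the case that returns nothing.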
My index contains the strings "Mexico", "Mexico City", "Kuwait", and "Kuwait City". When someone searches for "Kuwait C" I expect "Kuwait City" to score the highest, but in fact the query returns zero results for both "Kuwait C" and "Kuwait Ci" (not even plain "Kuwait"). If the user types "Kuwait" or "Kuwait Cit" I get the result I expect ("Kuwait" at the top for the former, "Kuwait City" at the top for the latter). FWIW, "name" is simply a "text" field in my mapping.
Naturally, I've also tried something simpler, like:
{
  "query": {
    "match": {
      "name": {
        "query": "Kuwait C",
        "fuzziness": 2
      }
    }
  }
}
but that returns "Kuwait" above "Kuwait City", followed by a bunch of zero-score non-matches. This is marginally better because a) at least it returns something and b) I can filter out the zero-score results. However, it means the user won't start seeing expected results until they type some more characters. I could perform my own post-search processing to sort the better match to the top but I feel like that's going to just lead to more trouble down the road, mixing ES and my own rules.
For what it's worth, I've also tried the "search_as_you_type" field type, and the results were the same. I've also tried using fuzzy for the first clause and prefix for the second, but I get zero results, so that is probably an entirely wrong approach.
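For concreteness, the fuzzy-then-prefix variant can be sketched like this. The exact shape is an assumption: every word except the last becomes a fuzzy clause, and the last (possibly partial) word becomes a prefix clause wrapped in `span_multi`. The helper name `build_fuzzy_prefix_span_near` is hypothetical, and the typed text is passed through unchanged.

```python
def build_fuzzy_prefix_span_near(text, field="name", fuzziness=2, slop=3):
    """Build a span_near query body: fuzzy clauses for complete words, prefix for the last.

    Hypothetical helper: assumes whitespace splitting and treats the final
    word as the one still being typed.
    """
    words = text.split()
    if not words:
        raise ValueError("search text must contain at least one word")

    *complete, last = words

    # Fuzzy clauses for every word the user has (presumably) finished typing.
    clauses = [
        {
            "span_multi": {
                "match": {
                    "fuzzy": {field: {"fuzziness": fuzziness, "value": word}}
                }
            }
        }
        for word in complete
    ]

    # Prefix clause for the word still in progress.
    clauses.append(
        {"span_multi": {"match": {"prefix": {field: {"value": last}}}}}
    )

    return {
        "query": {
            "span_near": {"clauses": clauses, "slop": slop, "in_order": True}
        }
    }
```

For the input "Kuwait C" this produces a fuzzy clause for "Kuwait" followed by a prefix clause for "C".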
I'm stumped. What's the secret? I'm running Elasticsearch 7.13.0.