In the spirit of teaching a man to fish [1], let's walk through what's going on here.
First, we can see which query is actually executed, once it has been parsed and rewritten, by using the explain API on the document ID you expect to match:
GET fuzzytest/tip/1/_explain
{
  "query": {
    "query_string": {
      "query": "25r20~",
      "fields": [ "code" ]
    }
  }
}
This reveals that the query is devoid of clauses (we have an empty query):
"description": "no match on required clause (MatchNoDocsQuery(\"empty BooleanQuery\"))",
This tells us that the fuzzy query produced no terms at all. So let's look at what is actually in the index for the value in question, using the analyze API:
GET fuzzytest/_analyze
{
  "analyzer": "my_identifier_analyzer",
  "text": "335/25R20"
}
This shows us the terms in the index (only the token values are shown):
"tokens": [
  { "token": "335/25r20" },
  { "token": "335" },
  { "token": "33525" },
  { "token": "33525r20" },
  { "token": "25" },
  { "token": "r" },
  { "token": "20" }
]
Let's fix that. According to the docs [2], the fuzziness setting belongs on the query_string query itself, alongside query and fields, so the request should be:
GET fuzzytest/tip/1/_explain
{
  "query": {
    "query_string": {
      "query": "25r20~",
      "fuzziness": 2,
      "fields": [ "code" ]
    }
  }
}
Sadly, that still does not match, and increasing the fuzziness setting does not change this. The reason is that the token being "fuzzied" is 25r20, which is not fed through your analyzer. If you look again at the tokens we saw in the index, none of them is within an edit distance of 2 (the maximum allowed) of 25r20.
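To check that claim by hand, here is a minimal Python sketch that computes the plain Levenshtein distance between 25r20 and each token from the analyze output. It is only an illustration and not what Elasticsearch runs internally: Lucene's fuzzy matching can also count a swap of two adjacent characters as a single edit, which this sketch ignores.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


# Tokens copied from the _analyze output for "335/25R20".
indexed_tokens = ["335/25r20", "335", "33525", "33525r20", "25", "r", "20"]
fuzzy_term = "25r20"  # the whole text before "~", taken as-is (not analyzed)
max_edits = 2         # the largest fuzziness allowed

distances = {token: levenshtein(fuzzy_term, token) for token in indexed_tokens}
candidates = [token for token, d in distances.items() if d <= max_edits]

for token, d in distances.items():
    print(f"{token!r}: {d} edits")
print("tokens within", max_edits, "edits:", candidates)
```

Every indexed token ends up at least three edits away from 25r20, so the candidate list is empty, which is consistent with the empty query reported by the explain API.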
The irony is that the non-fuzzy version of your query is better at fuzzy matching. It applies the analyzer's tokenization policy to split 25r20 into multiple tokens, whereas the query_string parser, when it sees the fuzzy operator, assumes that all non-whitespace text before the ~ character (25r20) is what should be edit-distance matched, rather than just the last token (20).
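The contrast can be sketched in Python. The rough_analyze helper below is a hypothetical stand-in for my_identifier_analyzer, whose definition is not shown in this thread: it simply lowercases the text and splits it into runs of digits and runs of letters, which is enough to reproduce the 25, r and 20 parts seen in the analyze output. The real analyzer may well behave differently on other inputs, and the fuzzy term is only checked for exact presence here to keep the sketch short.

```python
import re

# Tokens copied from the _analyze output for "335/25R20".
indexed_tokens = {"335/25r20", "335", "33525", "33525r20", "25", "r", "20"}


def rough_analyze(text: str) -> list:
    """Hypothetical stand-in for the analyzer: lowercase the text, then
    split it into runs of digits and runs of letters."""
    return re.findall(r"[0-9]+|[a-z]+", text.lower())


# Non-fuzzy query: the text is analyzed, so each part is looked up on its own.
analyzed_terms = rough_analyze("25r20")
exact_hits = [term for term in analyzed_terms if term in indexed_tokens]

# Fuzzy query: everything before "~" is kept as one unanalyzed term.
fuzzy_term = "25r20~".rstrip("~")
fuzzy_term_is_indexed = fuzzy_term in indexed_tokens

print("analyzed terms:", analyzed_terms)
print("analyzed terms found in index:", exact_hits)
print("unanalyzed fuzzy term found in index:", fuzzy_term_is_indexed)
```

Under these assumptions, all three analyzed parts are present in the index, while the unanalyzed term 25r20 is not, which is why the plain query finds the document and the fuzzy one does not.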
It's a messy world, huh?
[1] "Give a man a fish, feed him for a day. Teach a man to fish, feed him for a lifetime."
[2] Common options | Elasticsearch Guide [5.0] | Elastic