We saw some too_complex_to_determinize_exception errors when doing fuzzy search.
Exception details:
too_complex_to_determinize_exception: Determinizing automaton with 41207 states and 74541 transitions would result in more than 10000 states., ES Status: 500
Following is the example query DSL which caused the exception:
"query": {
"multi_match": {
"query": "nous savons que vous voulez vraiment commencer dès maintenant, mais vous allez devoir patienter un peu. recherchez dans le windows store la date de lancement.",
"fuzziness": "AUTO",
"fields": ["Term"]
}
}
Basically, the exception occurs when the query string is long.
But if I remove "fuzziness" from the query DSL, there is no exception.
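For example, the same query with the "fuzziness" parameter removed (just a minimal sketch of what I tested) completes without error:

"query": {
  "multi_match": {
    "query": "nous savons que vous voulez vraiment commencer dès maintenant, mais vous allez devoir patienter un peu. recherchez dans le windows store la date de lancement.",
    "fields": ["Term"]
  }
}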
And following are the analysis settings:
{
  "analysis": {
    "filter": {
      "whitespace_normalization": {
        "pattern": "\\s+",
        "type": "pattern_replace",
        "replacement": ""
      }
    },
    "analyzer": {
      "keyword_ngram_suggest": {
        "filter": [
          "lowercase",
          "whitespace_normalization",
          "ngram_filter"
        ],
        "type": "custom",
        "tokenizer": "keyword"
      },
      "lowercase_norm_keyword": {
        "filter": [
          "lowercase",
          "whitespace_normalization",
          "trim"
        ],
        "type": "custom",
        "tokenizer": "keyword"
      }
    }
  }
}
And field mappings:
"Term": {
"type": "text",
"analyzer": "keyword_ngram_suggest",
"search_analyzer": "lowercase_norm_keyword"
}
I guess the reason is that, with my analysis settings, the whole input query is treated as a single token, and fuzzy search cannot handle such a long token well.
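For example, an _analyze request along these lines (the index name here is just a placeholder) should show the search analyzer emitting a single very long token for the whole query string, since the whitespace_normalization filter strips all whitespace:

POST /my_index/_analyze
{
  "analyzer": "lowercase_norm_keyword",
  "text": "nous savons que vous voulez vraiment commencer dès maintenant, mais vous allez devoir patienter un peu. recherchez dans le windows store la date de lancement."
}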
My question is: is there any soft-limit setting whose threshold I can increase to allow longer tokens when doing fuzzy search?