When we run a regexp query that combines around 10 different terms, we get the following exception:
ElasticsearchException[Elasticsearch exception [type=too_complex_to_determinize_exception, reason=too_complex_to_determinize_exception: Determinizing automaton with 67 states and 278 transitions would result in more than 10000 states.]];
Increasing max_determinized_states resolves the error, but the number of terms passed varies from request to request, so we can't settle on a fixed value. I am thinking of using a wildcard query instead.
A wildcard query also builds an automaton internally, so is wildcard actually a better approach, or is there a better way?
My sample query using regex:
search {"from":0,"size":0,"timeout":"60s","query":{"bool":{"must":[{"regexp":{"filed_name":
{"value":".*term.*|.*term.*|.*term.*|.*term.*|.*term.*|.*term.*|.*term.*|.*term.*|.*term.*|.*term.*|.*term.*","flags_value":65535,"max_determinized_states":10000,"boost":1.0}
}},{"range":{"condition_field":
{"from":0,"to":null,"include_lower":false,"include_upper":true,"boost":1.0}
}},{"range":{"condition_filed":
{"from":0,"to":null,"include_lower":false,"include_upper":true,"boost":1.0}
}}],"adjust_pure_negative":true,"boost":1.0}},"profile":true,"sort":[{"_id":{"order":"asc"}}]}
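
For comparison, here is a minimal sketch of the wildcard alternative I am considering. Instead of one large alternation, it emits one small wildcard clause per term inside a bool/should, so no single automaton has to cover every term at once. The helper name build_wildcard_query is hypothetical, the range clause is reduced to a single gt condition for brevity, and the sketch assumes the terms contain no wildcard metacharacters (*, ? or \). It only builds the request body and does not call Elasticsearch. Patterns with a leading * can still be slow, so this may not be the best answer either.

```python
import json


def build_wildcard_query(field, terms):
    """Build a search body with one wildcard clause per term (hypothetical helper)."""
    # One small clause per term instead of a single ".*a.*|.*b.*|..." regexp.
    should = [{"wildcard": {field: {"value": f"*{term}*"}}} for term in terms]
    return {
        "from": 0,
        "size": 0,
        "timeout": "60s",
        "query": {
            "bool": {
                "must": [
                    # At least one of the wildcard clauses has to match.
                    {"bool": {"should": should, "minimum_should_match": 1}},
                    # Simplified stand-in for the range filter in the original query.
                    {"range": {"condition_field": {"gt": 0}}},
                ]
            }
        },
    }


if __name__ == "__main__":
    body = build_wildcard_query("filed_name", ["term1", "term2", "term3"])
    print(json.dumps(body, indent=2))
```

The number of should clauses grows with the number of terms, but each clause stays the same size, so max_determinized_states would not need to change as the term count changes.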