Edit Distance > 2 - Is It Really Unconfigurable?

I'm facing the same problem mentioned in this thread - I need to use an edit distance greater than 2 for fuzzy searches, but Elasticsearch doesn't seem to support this.

I understand the performance concerns with higher Levenshtein distance values, but for my specific use case with small datasets, this isn't a concern. What surprised me is that this appears to be completely unconfigurable - I can't find any setting to adjust this limit.
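For a dataset this small, one fallback outside Elasticsearch is to scan the terms directly and compute the distance in application code. The following is a minimal sketch of that idea, not anything Elasticsearch or Lucene provides: the function names `levenshtein` and `fuzzy_filter` and the sample terms are hypothetical. It uses plain Levenshtein distance, so a transposition costs two edits, which may differ from how Lucene's fuzzy matching counts them.

```python
from typing import Iterable, List


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance (no transpositions)."""
    # Keep the shorter string in the inner loop so the row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def fuzzy_filter(query: str, terms: Iterable[str], max_edits: int) -> List[str]:
    """Return the terms within max_edits of query (hypothetical helper)."""
    return [t for t in terms if levenshtein(query, t) <= max_edits]


if __name__ == "__main__":
    # Linear scan: only reasonable because the dataset is assumed to be small.
    print(fuzzy_filter("kitten", ["sitting", "mitten", "banana"], max_edits=3))
```

The scan is linear in the number of terms, which is exactly the cost the automaton-based approach avoids, so this only makes sense while the dataset stays small.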

Is this limitation truly hard-coded? If so, where can I find this value in the source code so that I can change it and build a custom version?

Have a look at:

Thank you David,

I've tried to increase the fuzzy edit distance limit beyond 2 by modifying the Lucene source code. Here's what I've done so far:

  1. Changed MAXIMUM_SUPPORTED_DISTANCE from 2 to 5 in the Lucene core

  2. Recompiled the lucene-core library

  3. Replaced the JAR file in Elasticsearch's lib directory

  4. Verified via decompilation that the change is present in the JAR

However, I'm still getting the same error when trying to use edit distances greater than 2. This suggests there might be additional validation happening elsewhere in the codebase that's enforcing this limit.

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Valid edit distances are [0, 1, 2] but was [3]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Valid edit distances are [0, 1, 2] but was [3]"
  },
  "status": 400
}
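For context, a request of roughly the following shape is the kind that would be expected to produce the error above. This is only a sketch: it builds and prints a fuzzy query body in Python without contacting a cluster, and the field name `title`, the search term, and the helper `build_fuzzy_query` are hypothetical rather than taken from my actual setup.

```python
import json


def build_fuzzy_query(field: str, value: str, fuzziness: int) -> dict:
    """Build a fuzzy query body (hypothetical helper, no client involved)."""
    return {
        "query": {
            "fuzzy": {
                field: {
                    "value": value,
                    # An explicit edit distance; 3 is outside [0, 1, 2].
                    "fuzziness": fuzziness,
                }
            }
        }
    }


if __name__ == "__main__":
    body = build_fuzzy_query("title", "elasticsaerch", 3)
    print(json.dumps(body, indent=2))
```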

Are there other places in Elasticsearch or Lucene where this validation might be occurring? I suspect there might be another layer of checks beyond the MAXIMUM_SUPPORTED_DISTANCE constant.

Environment:

  • Elasticsearch version: 9.2.1

  • Lucene version: 10.3.1

This blog offers a glimpse into why it is not just a case of bumping a value:
