Fuzziness support in intervals query?

ES 7.0 introduced support for the Lucene intervals query, which is more powerful and easier to work with than span queries (analyzer support, etc.). However, one thing that span queries support and the intervals query doesn't is fuzziness. Since the intervals query is supposed to help with legal and patent search, I have a hard time understanding how that could work without fuzziness support.

I could not find much information about this when browsing the GitHub issues. Is there a reason why the intervals query doesn't support fuzziness (perhaps because Lucene doesn't)? Is it on the roadmap?

Here is another sample case where fuzziness support in the intervals query could come in handy.

Hello, Val.

After https://issues.apache.org/jira/browse/LUCENE-9028 it's comparatively easy to make fuzzy intervals in Elasticsearch. However, intervals still have one severe shortcoming compared to spans.

@Adrien_Grand I'd really appreciate if you could provide some insights. Thank you very much in advance!

@Mikhail_Khludnev, @Adrien_Grand, I totally agree with @val. I also need fuzzy support in intervals queries. Could you please suggest something?

I think it's worth opening an issue in Elasticsearch, and we'll discuss where the support should land (Elasticsearch or Lucene). As @Mikhail_Khludnev suggested, it should be easy to make fuzzy intervals in Elasticsearch using the MultiTermIntervalsSource, except that it is not exposed :wink:. Queries that need to check positions cannot handle a large number of multi-terms, so we have some logic to restrict them to those that expand to fewer than a provided threshold (bounded to 1024). With this protection in place, I don't see why we should not expose them more simply in Lucene.
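
To illustrate the expansion limit described above, here is a toy sketch in Python. It is not Lucene's or Elasticsearch's implementation: the function names, the in-memory term dictionary, and the use of plain Levenshtein distance are all assumptions made for the example. The idea is only that a fuzzy term is expanded into the concrete terms it matches, and the query is rejected if that expansion grows past a threshold.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def expand_fuzzy(term, dictionary, max_edits=1, max_expansions=1024):
    """Hypothetical helper: return the dictionary terms within max_edits
    of `term`, or raise if more than max_expansions terms match."""
    matches = []
    for candidate in sorted(dictionary):
        if edit_distance(term, candidate) <= max_edits:
            matches.append(candidate)
            if len(matches) > max_expansions:
                raise ValueError(
                    f"'{term}' expands to more than {max_expansions} terms"
                )
    return matches


# Example term dictionary (made up for the illustration).
terms = {"patent", "patents", "latent", "parent", "potent", "pattern"}
expanded = expand_fuzzy("patent", terms, max_edits=1)
```

Each expanded term would then be treated as an alternative at that position of the interval, which is why a large expansion makes position checking expensive.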

@jimczi
Thank you so much for shedding some light on this! That's good news!

As suggested, I've created the new issue: https://github.com/elastic/elasticsearch/issues/49595
