Performance problem in Percolator fetch phase classes

Hi,

We are using Elasticsearch 7.8.0 and are seeing a performance issue that we traced back to the method PercolatorHighlightSubFetchPhase#locatePercolatorQuery. Lucene 8.6 introduced a change to QueryVisitor and TermInSetQuery [1] that results in visitLeaf being called for each matching term in the index. This can become very costly even for "non-percolating" queries, and at this point PercolatorHighlightSubFetchPhase is only determining whether it should run at all [2]. In addition, PercolatorMatchedSlotSubFetchPhase reuses this very same method [3].

Has anyone else encountered similar problems? Is this worth a bug report on GitHub?

Regards,
András

[1] https://github.com/apache/lucene-solr/pull/1465
[2] https://github.com/elastic/elasticsearch/blob/master/modules/percolator/src/main/java/org/elasticsearch/percolator/PercolatorHighlightSubFetchPhase.java#L56
[3] https://github.com/elastic/elasticsearch/blob/master/modules/percolator/src/main/java/org/elasticsearch/percolator/PercolatorMatchedSlotSubFetchPhase.java#L66

Correction: the linked Lucene PR might help resolve this problem, as it changes TermInSetQuery so that it resolves its terms lazily when visited by a QueryVisitor.
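
To illustrate why lazy resolution matters here, below is a minimal toy model in plain Java. It does not use Lucene's real API: ToyVisitor, ToyTermSetQuery and IgnoringVisitor are hypothetical names made up for this sketch. The assumption is simply that the visitor only wants to inspect the shape of the query (as locatePercolatorQuery does) and never looks at the terms themselves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class LazyVisitSketch {

    /** Hypothetical stand-in for a query visitor; NOT Lucene's QueryVisitor. */
    interface ToyVisitor {
        // Eager style: the query hands over every term it holds.
        void consumeTerms(List<String> terms);

        // Lazy style: the query hands over a supplier the visitor may never call.
        void consumeTermsLazily(Supplier<List<String>> terms);
    }

    /** Hypothetical stand-in for a query that holds a large set of terms. */
    static final class ToyTermSetQuery {
        private final int termCount;
        int termsMaterialized = 0; // counts the work done while resolving terms

        ToyTermSetQuery(int termCount) {
            this.termCount = termCount;
        }

        private List<String> materialize() {
            List<String> terms = new ArrayList<>(termCount);
            for (int i = 0; i < termCount; i++) {
                terms.add("term" + i);
                termsMaterialized++;
            }
            return terms;
        }

        void visitEagerly(ToyVisitor visitor) {
            visitor.consumeTerms(materialize());
        }

        void visitLazily(ToyVisitor visitor) {
            visitor.consumeTermsLazily(this::materialize);
        }
    }

    /** A visitor that only inspects query structure and ignores all terms. */
    static final class IgnoringVisitor implements ToyVisitor {
        @Override
        public void consumeTerms(List<String> terms) {
            // Terms are not needed, but the query already paid to build them.
        }

        @Override
        public void consumeTermsLazily(Supplier<List<String>> terms) {
            // Terms are not needed, so the supplier is never invoked.
        }
    }

    /** Number of terms resolved when the query is visited eagerly. */
    public static int eagerCost(int termCount) {
        ToyTermSetQuery query = new ToyTermSetQuery(termCount);
        query.visitEagerly(new IgnoringVisitor());
        return query.termsMaterialized;
    }

    /** Number of terms resolved when the query is visited lazily. */
    public static int lazyCost(int termCount) {
        ToyTermSetQuery query = new ToyTermSetQuery(termCount);
        query.visitLazily(new IgnoringVisitor());
        return query.termsMaterialized;
    }

    public static void main(String[] args) {
        System.out.println("terms resolved (eager): " + eagerCost(100_000));
        System.out.println("terms resolved (lazy):  " + lazyCost(100_000));
    }
}
```

In this toy model the eager path resolves every term even though the visitor discards them, while the lazy path resolves none. Whether the real fetch phase benefits in the same way depends on how its visitor is implemented, so treat this as an illustration of the idea rather than a measurement.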