ES 7.7.1 -> ES 7.12.0 WAND Performance Issue

Hello,

I am upgrading an existing cluster from ES 7.7.1 (Lucene 8.5.1) to ES 7.12.0 (Lucene 8.8.0) and have hit a massive performance regression in basic match queries that use minimum should match. I believe this is due to LUCENE-9346, which was supposed to improve performance.

My queries are simple match queries against a single field, with the OR operator and a minimum should match value usually between 20% and 50%. The search text ranges from 10 to 50 tokens after analysis. In ES 7.12.0 these queries run 4-20x slower than they did in ES 7.7.1. If I remove the minimum should match parameter, they run ~2-3x FASTER than the same queries without minimum should match in ES 7.7.1, but still about 3x slower than ES 7.7.1 with minimum should match.
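For illustration, the two query shapes being compared can be sketched as request bodies like the ones below. This is only a sketch: the field name `body`, the search text, and the `30%` value are placeholders, not the actual production values, and `build_match_query` is a hypothetical helper.

```python
import json


def build_match_query(field, text, minimum_should_match=None):
    """Build a match query body using the OR operator.

    `field` and `text` are placeholders; minimum_should_match is only
    included when a value is given.
    """
    params = {"query": text, "operator": "or"}
    if minimum_should_match is not None:
        params["minimum_should_match"] = minimum_should_match
    return {"query": {"match": {field: params}}}


# Hypothetical field name, text, and percentage, for illustration only.
with_msm = build_match_query("body", "some long search text", "30%")
without_msm = build_match_query("body", "some long search text")

print(json.dumps(with_msm, indent=2))
print(json.dumps(without_msm, indent=2))
```

The only difference between the two bodies is the presence of `minimum_should_match`, which is the parameter that triggers the slowdown described here.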

Here are sample took times from running the same query against the two ES versions, with and without the min should match parameter. Same cluster and index specs, exact same data, fully optimized single-segment index.

ES 7.7.1 with min should match: 50ms
ES 7.12.0 with min should match: 1022ms
ES 7.7.1 without min should match: 350ms
ES 7.12.0 without min should match: 167ms
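As a quick sanity check, the ratios implied by these four took times can be computed directly. This is just arithmetic on the single sample above, not a benchmark; the dictionary layout and variable names are my own.

```python
# Took times in milliseconds from the sample above,
# keyed by (ES version, minimum should match used?).
took_ms = {
    ("7.7.1", True): 50,
    ("7.12.0", True): 1022,
    ("7.7.1", False): 350,
    ("7.12.0", False): 167,
}


def ratio(numerator_key, denominator_key):
    """Return how many times larger one took time is than another."""
    return took_ms[numerator_key] / took_ms[denominator_key]


# 7.12.0 vs 7.7.1, both with minimum should match (the regression).
slowdown_with_msm = ratio(("7.12.0", True), ("7.7.1", True))
# 7.7.1 vs 7.12.0, both without minimum should match (the improvement).
speedup_without_msm = ratio(("7.7.1", False), ("7.12.0", False))
# 7.12.0 without minimum should match vs 7.7.1 with it.
new_plain_vs_old_msm = ratio(("7.12.0", False), ("7.7.1", True))

print(f"with min should match: {slowdown_with_msm:.1f}x slower in 7.12.0")
print(f"without min should match: {speedup_without_msm:.1f}x faster in 7.12.0")
print(f"7.12.0 without vs 7.7.1 with: {new_plain_vs_old_msm:.1f}x slower")
```

For this sample that works out to roughly 20x slower with minimum should match, about 2x faster without it, and a bit over 3x slower when comparing 7.12.0 without the parameter to 7.7.1 with it.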

In general it looks like WAND scoring has been improved, but the min should match support introduced in LUCENE-9346 might have made things significantly worse.

Any ideas? I have hot threads output and it pretty clearly shows the time is spent in WANDScorer and DisiPriorityQueue:

       app//org.apache.lucene.search.DisiPriorityQueue.upHeap(DisiPriorityQueue.java:135)
       app//org.apache.lucene.search.DisiPriorityQueue.add(DisiPriorityQueue.java:103)
       app//org.apache.lucene.search.WANDScorer.pushBackLeads(WANDScorer.java:316)
       app//org.apache.lucene.search.WANDScorer.doNextCompetitiveCandidate(WANDScorer.java:456)
       app//org.apache.lucene.search.WANDScorer.access$400(WANDScorer.java:46)
       app//org.apache.lucene.search.WANDScorer$1.advance(WANDScorer.java:267)
       app//org.apache.lucene.search.WANDScorer$1.nextDoc(WANDScorer.java:244)
       app//org.apache.lucene.search.Weight$DefaultBulkScorer.scoreRange(Weight.java:262)
       app//org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:239)
       app//org.elasticsearch.search.internal.CancellableBulkScorer.score(CancellableBulkScorer.java:45)
       app//org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39)

Cluster specs:
12 c5d.4xlarge data nodes with 16 GB heaps
~440M docs, 36 shards, 1 replica, index fully optimized
Single field indexed / searched with index_options freqs

cc @jpountz

Thanks,
Matt Weber


Thanks for reporting this. I'll file a Lucene JIRA, since this is most likely a Lucene issue.

Since you say that your queries are simple match queries, am I right to assume that there is no filtering involved, only a top-level match query that requires 20%-50% of the 10-50 SHOULD clauses to match?
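To make those numbers concrete, here is a rough sketch of how many SHOULD clauses would have to match at the ends of that range. It assumes one SHOULD clause per analyzed token and that a positive percentage is rounded down, as described in the Elasticsearch `minimum_should_match` documentation; `required_should_clauses` is a hypothetical helper, not a Lucene or Elasticsearch API.

```python
def required_should_clauses(num_clauses: int, percent: int) -> int:
    """Minimum number of optional clauses that must match.

    Assumes a positive percentage that is rounded down.
    """
    return (num_clauses * percent) // 100


# Corners of the range discussed above: 10-50 clauses, 20%-50%.
for num_clauses, percent in [(10, 20), (10, 50), (50, 20), (50, 50)]:
    needed = required_should_clauses(num_clauses, percent)
    print(f"{num_clauses} clauses at {percent}%: at least {needed} must match")
```

Under those assumptions the requirement ranges from 2 matching clauses (20% of 10) up to 25 (50% of 50).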

I opened https://issues.apache.org/jira/browse/LUCENE-9958.


You found the issue and fixed it as well! Thank you @jpountz!
