Elastic 8.10 introduced a major change QueryPhaseCollector (see https://github.com/elastic/elasticsearch/pull/97410) in how lucene queries are executed. I have observed that now, when using min_score, each document gets scored twice! That does NOT happen when min_score is not used, and this did NOT happen on previous versions of Elasticsearch.
Aside from being terribly inefficient, this actually creates problems for certain low level query types (Span) which make use of "positions" and would give different results when a document is scored repeatedly. I struggled to document a clear example using this exact Span behavior, but we have custom Lucene queries that are failing due to this "invoke score twice" issue. So despite not demonstrating true query failure, I can demonstrate that "score is invoked twice" and hopefully you agree that alone is bad / needs fixing.
Here is an easy way for you to reproduce the "invoke score twice":
- create a new index with just 1 document
POST /testing/_doc
{
"message": "hello"
}
-
Set a debugger breakpoint on org.apache.lucene.search.TermScorer.score()
-
Execute this query
POST /testing/_search
{
"min_score": 0.001,
"query": {
"term": {
"message": "hello"
}
}
}
-
Observe the breakpoint gets hit 2 times for this same document (there is only 1 doc in the index), see below for the stacktrace. This is NOT the expected behavior. Prior versions of Elastic call score() only once as expected (see below for identical test stacktrace in ES 8.6.2). Comparing these stacktraces, I see that ES 8.6.2 used an object called ScoreCachingWrappingScorer which prevents the lower level score() method from being called a second time. However ES8.10.x (also ES8.11.0) no longer has the ScoreCachingWrappingScorer object at all (due to the new QueryPhaseCollector logic) so I believe this is a new "bug" in Elasticsearch.
-
Remove the min_score clause from the query and repeat, this time observe the breakpoint gets hit only 1 time (as expected). So this issue is caused by the use of min_score.
Stacktrace in ES 8.11.0
(First call to score within the single request)
Thread [elasticsearch[singlenode][search_worker][T#17]] (Suspended (breakpoint at line 85 in TermScorer))
TermScorer.score() line: 85
QueryPhaseCollector.isDocWithinMinScore(Scorable) line: 122
QueryPhaseCollector.shouldCollectTopDocs(int, Scorable, Bits) line: 118
QueryPhaseCollector$TopDocsLeafCollector.collect(int) line: 236
Weight$DefaultBulkScorer.scoreRange(LeafCollector, DocIdSetIterator, TwoPhaseIterator, DocIdSetIterator, Bits, int, int) line: 277
Weight$DefaultBulkScorer.score(LeafCollector, Bits, int, int) line: 236
CancellableBulkScorer.score(LeafCollector, Bits, int, int) line: 45
CancellableBulkScorer(BulkScorer).score(LeafCollector, Bits) line: 38
ContextIndexSearcher.searchLeaf(LeafReaderContext, Weight, Collector) line: 536
ContextIndexSearcher.search(List<LeafReaderContext>, Weight, Collector) line: 460
ContextIndexSearcher.lambda$search$4(AtomicInteger, LeafReaderContext[], Weight, Collector, List) line: 375
0x0000000100d0dde0.call() line: not available
ContextIndexSearcher$1(FutureTask<V>).run() line: 317
TimedRunnable.doRun() line: 33
ThreadContext$ContextPreservingAbstractRunnable.doRun() line: 983
ThreadContext$ContextPreservingAbstractRunnable(AbstractRunnable).run() line: 26
TaskExecutionTimeTrackingEsThreadPoolExecutor(ThreadPoolExecutor).runWorker(ThreadPoolExecutor$Worker) line: 1144
ThreadPoolExecutor$Worker.run() line: 642
Thread.runWith(Object, Runnable) line: 1596
Thread.run() line: 1583
(Second call to score within the same request as above)
Thread [elasticsearch[singlenode][search_worker][T#17]] (Suspended (breakpoint at line 85 in TermScorer))
TermScorer.score() line: 85
TopScoreDocCollector$SimpleTopScoreDocCollector$1.collect(int) line: 73
QueryPhaseCollector$TopDocsLeafCollector.collect(int) line: 242
Weight$DefaultBulkScorer.scoreRange(LeafCollector, DocIdSetIterator, TwoPhaseIterator, DocIdSetIterator, Bits, int, int) line: 277
Weight$DefaultBulkScorer.score(LeafCollector, Bits, int, int) line: 236
CancellableBulkScorer.score(LeafCollector, Bits, int, int) line: 45
CancellableBulkScorer(BulkScorer).score(LeafCollector, Bits) line: 38
ContextIndexSearcher.searchLeaf(LeafReaderContext, Weight, Collector) line: 536
ContextIndexSearcher.search(List<LeafReaderContext>, Weight, Collector) line: 460
ContextIndexSearcher.lambda$search$4(AtomicInteger, LeafReaderContext[], Weight, Collector, List) line: 375
0x0000000100d0dde0.call() line: not available
ContextIndexSearcher$1(FutureTask<V>).run() line: 317
TimedRunnable.doRun() line: 33
ThreadContext$ContextPreservingAbstractRunnable.doRun() line: 983
ThreadContext$ContextPreservingAbstractRunnable(AbstractRunnable).run() line: 26
TaskExecutionTimeTrackingEsThreadPoolExecutor(ThreadPoolExecutor).runWorker(ThreadPoolExecutor$Worker) line: 1144
ThreadPoolExecutor$Worker.run() line: 642
Thread.runWith(Object, Runnable) line: 1596
Thread.run() line: 1583
Stacktrace in ES 8.6.2
Thread [elasticsearch[singlenode][search][T#7]] (Suspended (breakpoint at line 74 in TermScorer))
TermScorer.score() line: 74
ScoreCachingWrappingScorer.score() line: 61
MinimumScoreCollector.collect(int) line: 45
Weight$DefaultBulkScorer.scoreRange(LeafCollector, DocIdSetIterator, TwoPhaseIterator, Bits, int, int) line: 274
Weight$DefaultBulkScorer.score(LeafCollector, Bits, int, int) line: 254
CancellableBulkScorer.score(LeafCollector, Bits, int, int) line: 45
CancellableBulkScorer(BulkScorer).score(LeafCollector, Bits) line: 38
ContextIndexSearcher.searchLeaf(LeafReaderContext, Weight, Collector) line: 196
ContextIndexSearcher.search(List<LeafReaderContext>, Weight, Collector) line: 170
ContextIndexSearcher(IndexSearcher).search(Query, Collector) line: 519
QueryPhase.searchWithCollector(SearchContext, ContextIndexSearcher, Query, LinkedList<QueryCollectorContext>, boolean, boolean) line: 224
QueryPhase.executeInternal(SearchContext) line: 179
QueryPhase.execute(SearchContext) line: 80
SearchService.loadOrExecuteQueryPhase(ShardSearchRequest, SearchContext) line: 470
SearchService.executeQueryPhase(ShardSearchRequest, SearchShardTask) line: 635
SearchService.lambda$executeQueryPhase$2(ShardSearchRequest, SearchShardTask) line: 495
0x0000000100c45530.get() line: not available
0x0000000100c45740.get() line: not available
ActionRunnable$2.accept(ActionListener<T>) line: 50
ActionRunnable$2.accept(Object) line: 47
ActionRunnable$3.doRun() line: 72
ActionRunnable$3(AbstractRunnable).run() line: 26
TimedRunnable.doRun() line: 33
ThreadContext$ContextPreservingAbstractRunnable.doRun() line: 917
ThreadContext$ContextPreservingAbstractRunnable(AbstractRunnable).run() line: 26
EWMATrackingEsThreadPoolExecutor(ThreadPoolExecutor).runWorker(ThreadPoolExecutor$Worker) line: 1144
ThreadPoolExecutor$Worker.run() line: 642
Thread.run() line: 1589