We have a query that returns a stupid number of results (98k). I won't go into details as to why. To retrieve them all we use a scroll with 1,000 documents per batch.
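
A minimal sketch of this kind of scroll loop, with the time of each round trip recorded, may make the numbers further down easier to reproduce. It is not the actual test code. It assumes a client object exposing `search`, `scroll` and `clear_scroll` the way the official Python client (elasticsearch-py 7.x) does; `timed_scroll`, the index name and the query are hypothetical.

```python
import time


def timed_scroll(client, index, query, batch_size=1000, keep_alive="1m"):
    """Pull every hit through the scroll API.

    Returns (hits, timings) where timings holds the seconds spent on each
    round trip: the initial search first, then one entry per scroll call.
    """
    hits = []
    timings = []

    # Initial search opens the scroll context and returns the first batch.
    start = time.perf_counter()
    resp = client.search(
        index=index,
        body={"query": query},
        scroll=keep_alive,
        size=batch_size,
    )
    timings.append(time.perf_counter() - start)
    scroll_id = resp["_scroll_id"]

    try:
        # An empty batch signals that the scroll is exhausted.
        while resp["hits"]["hits"]:
            hits.extend(resp["hits"]["hits"])
            start = time.perf_counter()
            resp = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            timings.append(time.perf_counter() - start)
            scroll_id = resp["_scroll_id"]
    finally:
        # Release the server-side scroll context.
        client.clear_scroll(scroll_id=scroll_id)

    return hits, timings
```

With a real client this might be called as `timed_scroll(es, "my-index", {"query_string": {"query": "..."}})`, where `es` and `"my-index"` are placeholders. Comparing `timings[0]` with the remaining entries separates the cost of the first request from the cost of the follow-up scroll requests.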
We recently (finally) upgraded to v7.x of Elasticsearch and saw the run time of this exact same query go from 9 seconds to over 20 seconds. The query is part of our acceptance test suite, so I have consistent round-trip times from before and after the upgrade.
The query does not use aggregations. It is a multi-field query_string boolean query.
The slowdown appears to scale with the number of results. If I limit the number of results, the extra time falls away.
Previously, in ES v6, the request for the first 1,000 documents would take a long time (~500 ms), but subsequent scroll requests would be quick (~30-40 ms).
Now, with ES v7, the first request is much quicker (~250 ms), but subsequent requests against the scroll are much slower (~150 ms). Pulling 99 batches from the scroll, this extra 100+ ms per batch adds up.
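
As a rough sanity check, the per-request figures above can be summed. This is only a back-of-the-envelope sketch: it assumes 99 round trips in total (the first request plus 98 scroll requests) and takes 35 ms as the midpoint of the 30-40 ms range; `scroll_total_ms` is a hypothetical helper.

```python
def scroll_total_ms(first_ms, next_ms, round_trips):
    """Total time for one initial request plus the follow-up scroll requests."""
    return first_ms + next_ms * (round_trips - 1)


ROUND_TRIPS = 99  # assumption: first request + 98 scroll requests

v6_ms = scroll_total_ms(first_ms=500, next_ms=35, round_trips=ROUND_TRIPS)
v7_ms = scroll_total_ms(first_ms=250, next_ms=150, round_trips=ROUND_TRIPS)
extra_ms = v7_ms - v6_ms

print(f"v6: {v6_ms} ms, v7: {v7_ms} ms, extra: {extra_ms} ms")
# v6: 3930 ms, v7: 14950 ms, extra: 11020 ms
```

The estimated difference of about 11 seconds is in line with the observed jump from 9 seconds to over 20 seconds, which suggests the slower follow-up scroll requests account for essentially all of the regression.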
Is there anything in Elasticsearch v7 that changed the scroll performance?
Is there any way to optimise the scroll and get back closer to my v6 performance?