I noticed if I specify a very high size query param for the below query, the response latency increases 100x (2ms to 200-500ms) when running locally. The total doc count is ~ 1000, totalhits for this particular query is only 10. Why the increase?
The array that collects the hits during the search phase is allocated up front. There is work around for this that only allocates max(total_docs_on_shard, requested_docs) up front but its still dangerous to do that because if you did have a ton of documents then it'd be slow again.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.