Why does using ?size=1000000 increase query latency 100X when totalHits is only 10


(Nikita Tovstoles) #1

I noticed if I specify a very high size query param for the below query, the response latency increases 100x (2ms to 200-500ms) when running locally. The total doc count is ~ 1000, totalhits for this particular query is only 10. Why the increase?

Thank you,

-nikita

{
  "query": {
    "bool": {
      "should": {
        "nested": {
          "query": {
            "term": {
              "brand.id": 551
            }
          },
          "path": "brand"
        }
      }
    }
  },
  "_source": false
}

(Nik Everett) #2

The array that collects the hits during the search phase is allocated up front. There is work around for this that only allocates max(total_docs_on_shard, requested_docs) up front but its still dangerous to do that because if you did have a ton of documents then it'd be slow again.


(Nikita Tovstoles) #3

Thanks, Nik; makes sense.


(system) #4