How to pull a large amount of data (all documents in an index) using the Elasticsearch Python client?

I'm using elasticsearch.helpers.scan with a match_all query to pull down over 1M documents from Elasticsearch.
The process is very slow (it takes over 2 hours).
Is there a better way to pull down all the documents from Elasticsearch?

Use PIT (point in time) or Scroll.

Thanks @casterQ.
Can you elaborate? I'm using scan, which I think is a wrapper around Scroll, isn't it?
Could you please also provide an example of PIT and Scroll?

  1. Snapshot (the result can only be used by Elasticsearch)
  2. CCR (requires a Platinum license)
  3. PIT or scroll (these are the ways to pull data out of ES; on later versions, PIT is the recommended one)

Here is the PIT doc:
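As a rough illustration of the PIT approach, here is a minimal sketch that pages through an index with PIT plus search_after. It assumes the 8.x Python client (keyword arguments on `open_point_in_time`, `search` and `close_point_in_time`) and a server recent enough to support the `_shard_doc` sort (7.12 or later, as far as I know). The function name `iter_all_docs`, the page size and the keep-alive value are made up for the example.

```python
def iter_all_docs(es, index, page_size=1000, keep_alive="2m"):
    """Yield every hit in `index` using a point in time and search_after.

    `es` is expected to be an elasticsearch.Elasticsearch client (8.x style).
    """
    pit_id = es.open_point_in_time(index=index, keep_alive=keep_alive)["id"]
    search_after = None
    try:
        while True:
            params = dict(
                size=page_size,
                query={"match_all": {}},
                # The index is implied by the PIT, so it is not passed again.
                pit={"id": pit_id, "keep_alive": keep_alive},
                # _shard_doc is a cheap tiebreaker sort intended for PIT paging.
                sort=[{"_shard_doc": "asc"}],
                # Skipping the total hit count saves work on every page.
                track_total_hits=False,
            )
            if search_after is not None:
                params["search_after"] = search_after
            resp = es.search(**params)
            # The PIT id can change between requests, so keep the latest one.
            pit_id = resp["pit_id"]
            hits = resp["hits"]["hits"]
            if not hits:
                break
            yield from hits
            # Resume the next page after the sort values of the last hit.
            search_after = hits[-1]["sort"]
    finally:
        es.close_point_in_time(id=pit_id)


# Usage (assumes a reachable cluster; URL and index name are placeholders):
# from elasticsearch import Elasticsearch
# es = Elasticsearch("http://localhost:9200")
# for hit in iter_all_docs(es, "my-index"):
#     print(hit["_source"])
```

Whether this beats scan on its own depends on where the time goes. If the bottleneck is fetching large documents, restricting the returned fields or increasing the page size may matter more than the paging mechanism.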

And scroll can be sped up using slices:
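To make the slicing idea concrete, here is a minimal sketch of a sliced scroll: the scroll is split into `max_slices` independent slices and each slice is drained in its own thread. It assumes the 8.x Python client, where `search` accepts `scroll`, `slice`, `query` and `sort` as keyword arguments. The helper names `scroll_slice` and `scroll_all` and the default values are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scroll_slice(es, index, slice_id, max_slices, page_size=1000, keep_alive="2m"):
    """Return every hit that belongs to one slice of a sliced scroll."""
    resp = es.search(
        index=index,
        scroll=keep_alive,
        size=page_size,
        # Each worker gets its own slice id in the range [0, max_slices).
        slice={"id": slice_id, "max": max_slices},
        query={"match_all": {}},
        # Sorting by _doc is the cheapest order for a full scan.
        sort=["_doc"],
    )
    scroll_id = resp["_scroll_id"]
    docs = []
    try:
        while resp["hits"]["hits"]:
            docs.extend(resp["hits"]["hits"])
            resp = es.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = resp["_scroll_id"]
    finally:
        # Free the server-side scroll context as soon as the slice is done.
        es.clear_scroll(scroll_id=scroll_id)
    return docs


def scroll_all(es, index, max_slices=4, **kwargs):
    """Drain all slices in parallel threads and return the combined hits.

    max_slices must be at least 2; Elasticsearch rejects a slice max of 1.
    """
    with ThreadPoolExecutor(max_workers=max_slices) as pool:
        futures = [
            pool.submit(scroll_slice, es, index, i, max_slices, **kwargs)
            for i in range(max_slices)
        ]
        return [doc for future in futures for doc in future.result()]


# Usage (assumes a reachable cluster; URL and index name are placeholders):
# from elasticsearch import Elasticsearch
# es = Elasticsearch("http://localhost:9200")
# docs = scroll_all(es, "my-index", max_slices=4)
```

This version collects everything in memory to keep the sketch short. For 1M+ documents it is probably better to write each batch out as it arrives. A slice count close to the number of shards is a common starting point, but that is a tuning guess and worth measuring.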
