Hello, I have a question.
My index contains 50,000,000 documents.
I want to retrieve all of them with Python, using the code below:
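import elasticsearch
import json

body_str = {"query": {"match_all": {}}}
es_client = elasticsearch.Elasticsearch("address")
# Open the scroll context and fetch the first page of up to 1000 hits.
doc = es_client.search(index='index', body=body_str, request_timeout=60, scroll='1m', size=1000)
scroll_size = len(doc['hits']['hits'])
sid = doc['_scroll_id']
total_cnt = doc['hits']['total']
while scroll_size > 0:
    print(doc)
    # Fetch the next page, then refresh the scroll id and page size so the loop can end.
    doc = es_client.scroll(scroll_id=sid, scroll='10m', request_timeout=60)
    sid = doc['_scroll_id']
    scroll_size = len(doc['hits']['hits'])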
However, it takes too long to retrieve all the documents this way.
I want to speed it up.
The network bandwidth between the Python node and the Elasticsearch node is 1 Gbit/s.
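As a rough sanity check on the network side, the sketch below estimates the minimum time needed to move the whole index over a 1 Gbit/s link. The average document size of 1,000 bytes is only an illustrative assumption, not a measured value, and the function name `min_transfer_seconds` is hypothetical. It ignores protocol overhead, JSON serialization, and Elasticsearch processing time, so it is a lower bound only.

```python
def min_transfer_seconds(doc_count, avg_doc_bytes, link_bits_per_sec):
    """Lower bound on transfer time: total payload bits divided by link speed."""
    total_bits = doc_count * avg_doc_bytes * 8
    return total_bits / link_bits_per_sec


DOC_COUNT = 50_000_000               # from the question
LINK_BITS_PER_SEC = 1_000_000_000    # 1 Gbit/s, from the question
ASSUMED_AVG_DOC_BYTES = 1_000        # assumption for illustration, not measured

seconds = min_transfer_seconds(DOC_COUNT, ASSUMED_AVG_DOC_BYTES, LINK_BITS_PER_SEC)
print(f"at least {seconds:.0f} s ({seconds / 60:.1f} min)")
# prints: at least 400 s (6.7 min)
```

If the real run time is far above this bound for the actual document size, the link itself is probably not the limiting factor.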
Can you give me any advice?
Thank you.