Search & Scroll using Python


(Moshe Hayun) #1

Hi,

I am using the search & scroll mechanism to query a large amount of data, and I have run into an issue.
https://elasticsearch-py.readthedocs.io/en/master/api.html#elasticsearch.Elasticsearch.search

From time to time, due to connectivity issues, I get IncompleteRead (expected to read <x> more bytes) followed by MaxRetryError.

Using tcpdump, I noticed that I receive only a partial JSON response when this occurs.
I can't avoid the connectivity issues in my organization.

My problem is that when this error occurs, the scroll (from the Elastic API) doesn't repeat the same request.
It moves on to the next request, and I lose the data from the failed one.

E.g., I am querying 1 million records in chunks of 1,000.
If I hit 100 connectivity errors along the way, each one costs me a chunk of 1,000 records, so I receive only 900k records.

I would rather not implement this myself with retries on the same request.
Is there a way to make the API retry the same request when it fails?
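
For illustration, a hand-rolled version of such a retry might look like the sketch below. It is not the scroll API: it swaps scroll for `search_after` paging, on the assumption that the results are sorted by a unique field, so that every page request is stateless and can be re-sent unchanged after a failure. `fetch_all`, `search_page`, and the index and field names in the comment are hypothetical names chosen for this example. Which exception the client actually raises on a partial read is also an assumption, so the retryable exception types are a parameter.

```python
import time


def fetch_all(search_page, page_size=1000, max_retries=5,
              backoff_seconds=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Yield every hit, re-sending the SAME page request when it fails.

    search_page(after, size) is a hypothetical callable that returns a list
    of hits sorted by a unique key. Each hit is a dict with a "sort" entry,
    as in an Elasticsearch search response. With elasticsearch-py it could
    look roughly like this (index and field names are made up):

        def search_page(after, size):
            body = {"size": size,
                    "query": {"match_all": {}},
                    "sort": [{"my_unique_field": "asc"}]}
            if after is not None:
                body["search_after"] = after
            return es.search(index="my-index", body=body)["hits"]["hits"]
    """
    after = None  # sort values of the last hit that was received in full
    while True:
        for attempt in range(max_retries + 1):
            try:
                # The request depends only on `after`, so repeating it
                # returns the same page instead of skipping ahead.
                hits = search_page(after, page_size)
                break
            except retry_on:
                if attempt == max_retries:
                    raise  # give up after the last allowed retry
                sleep(backoff_seconds * (attempt + 1))  # linear backoff
        if not hits:
            return  # an empty page means everything has been read
        yield from hits
        after = hits[-1]["sort"]
```

The key design choice is that the cursor (`after`) lives on the client and only advances once a page has arrived in full, so a failed request costs time but no records. Passing something narrower than `Exception` as `retry_on`, such as the client's connection error class, would avoid retrying on genuine query errors.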

Thanks a lot.


(Moshe Hayun) #2

Checking again in case someone is familiar with this.

