Elasticsearch-dsl Python Library Using Scan

My code is essentially:

for hit in search.scan():
    pass  # do something with hit

I believe scan() uses scrolling under the hood. After running for a while, it fails with this error:

17:23:34 for hit in search.scan():
17:23:34 File "/usr/lib/python2.7/site-packages/elasticsearch_dsl/search.py", line 719, in scan
17:23:34 **self._params
17:23:34 File "/usr/lib/python2.7/site-packages/elasticsearch/helpers/actions.py", line 469, in scan
17:23:34 body={"scroll_id": scroll_id, "scroll": scroll}, **scroll_kwargs
17:23:34 File "/usr/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 84, in _wrapped
17:23:34 return func(*args, params=params, **kwargs)
17:23:34 File "/usr/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 1395, in scroll
17:23:34 "GET", "/_search/scroll", params=params, body=body
17:23:34 File "/usr/lib/python2.7/site-packages/elasticsearch/transport.py", line 358, in perform_request
17:23:34 timeout=timeout,
17:23:34 File "/usr/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 261, in perform_request
17:23:34 self._raise_error(response.status, raw_data)
17:23:34 File "/usr/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 182, in _raise_error
17:23:34 status_code, error_message, additional_info
17:23:34 elasticsearch.exceptions.NotFoundError: NotFoundError(404, u'search_phase_execution_exception', u'No search context found for id [19079999]')

I've read that this happens when the scroll takes too long, so that the search context expires before the next batch is requested, and that I need to tune it for my cluster setup (currently a single cluster with a single node, which unfortunately I cannot upgrade yet).

How do I calibrate it? What do I look for to ensure good performance? Any suggestion is appreciated.
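
For reference, here is a minimal sketch of where the scroll keep-alive could be adjusted. It assumes `search` is an `elasticsearch_dsl.Search` object and relies on `Search.params()` forwarding its keyword arguments to `elasticsearch.helpers.scan()`, which defaults to `scroll="5m"` and `size=1000`. The helper name `scan_with_keepalive` and the values `"30m"` and `500` are hypothetical and illustrative only, not tuned recommendations.

```python
def scan_with_keepalive(search, scroll="30m", size=500):
    """Yield hits from search.scan() using a custom scroll keep-alive.

    Assumption: `search` behaves like elasticsearch_dsl.Search, i.e. it has
    params(**kwargs) returning a modified copy and scan() yielding hits.
    The default values here are placeholders, not recommendations.
    """
    # params() returns a copy of the search; the extra keyword arguments
    # are passed through to elasticsearch.helpers.scan() by scan().
    tuned = search.params(scroll=scroll, size=size)
    for hit in tuned.scan():
        yield hit
```

Usage would look like `for hit in scan_with_keepalive(search): ...`. The keep-alive only has to cover the time spent processing one batch before the next scroll request, not the whole scan, so a smaller `size` or a longer `scroll` are the two knobs to try if per-hit processing is slow.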
