Does anyone else have problems with SearchContextMissingExceptions in scroll/scan operations?
I have logstash indices which each contain tens of millions of records. I need to walk over
the entire index, processing data from each record. To do this, I'm using the elasticsearch-py
Python library and Elasticsearch 1.6.0 on a small (4 node) cluster.
Here's my code:
    import elasticsearch
    import elasticsearch.exceptions
    import elasticsearch.helpers as helpers

    es = elasticsearch.Elasticsearch(['http://XXX.XXX.XXX.108:9200'], retry_on_timeout=True)
    # index_name holds the name of the logstash index to walk
    scanResp = helpers.scan(client=es, scroll="5m", index=index_name, timeout="5m", size=1000)
    for resp in scanResp:
        pass  # DO STUFF FOR ONE RECORD
The processing handles several thousand records a second while
it is running, so I don't think I'm hitting the 5 minute limit.
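One way to check that assumption would be to drive the scroll by hand and record how long passes between scroll requests. This is only a rough sketch: `scan_with_gaps` is a hypothetical helper, not part of elasticsearch-py, and it assumes the client's `search` and `scroll` calls behave as they do in the 1.x library shown in the trace.

```python
import time


def scan_with_gaps(client, index, scroll="5m", size=1000, clock=time.time):
    """Scan an index by hand, yielding (hit, gap) pairs.

    gap is the number of seconds between the previous request finishing
    and the scroll request that fetched this hit being sent. If gap ever
    approaches the scroll keep-alive, the search context may have expired.
    """
    # The initial scan request returns a scroll id but no hits.
    resp = client.search(index=index, search_type="scan", scroll=scroll, size=size)
    scroll_id = resp["_scroll_id"]
    last_done = clock()
    while True:
        gap = clock() - last_done
        resp = client.scroll(scroll_id, scroll=scroll)
        last_done = clock()
        hits = resp["hits"]["hits"]
        if not hits:
            break  # an empty batch means the scan is finished
        scroll_id = resp["_scroll_id"]
        for hit in hits:
            yield hit, gap
```

Keeping the largest gap seen while looping over `scan_with_gaps(es, index_name)` and comparing it with the "5m" keep-alive should show whether the limit is ever close to being hit.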
After an indeterminate amount of time, sometimes quickly and sometimes not,
I get this stack dump. I've formatted the last part for easier reading
and redacted part of the IP addresses.
Traceback (most recent call last):
  File "/home/ptrei/util/str2int.py", line 190, in <module>
    mymain()
  File "/home/ptrei/util/str2int.py", line 177, in mymain
    process_index(indexname)
  File "/home/ptrei/util/str2int.py", line 112, in process_index
    for resp in scanResp:
  File "/usr/lib/python2.6/site-packages/elasticsearch-1.4.0-py2.6.egg/elasticsearch/helpers/__init__.py", line 230, in scan
    resp = client.scroll(scroll_id, scroll=scroll)
  File "/usr/lib/python2.6/site-packages/elasticsearch-1.4.0-py2.6.egg/elasticsearch/client/utils.py", line 68, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/usr/lib/python2.6/site-packages/elasticsearch-1.4.0-py2.6.egg/elasticsearch/client/__init__.py", line 616, in scroll
    params=params, body=scroll_id)
  File "/usr/lib/python2.6/site-packages/elasticsearch-1.4.0-py2.6.egg/elasticsearch/transport.py", line 308, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/usr/lib/python2.6/site-packages/elasticsearch-1.4.0-py2.6.egg/elasticsearch/connection/http_urllib3.py", line 86, in perform_request
    self._raise_error(response.status, raw_data)
  File "/usr/lib/python2.6/site-packages/elasticsearch-1.4.0-py2.6.egg/elasticsearch/connection/base.py", line 102, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.NotFoundError: TransportError(404, u'{"_scroll_id":"c2NhbjswOzE7dG90YWxfaGl0czozNzkzNTg5ODs=","took":76,"timed_out":false,"_shards"
:{"total":5,"successful":0,"failed":5,"failures":[
{"status":404,"reason":"SearchContextMissingException[No search context found for id [13]]"},
{"status":404,"reason":"RemoteTransportException[[pegasus_101][inet[/XXX.XXX.XXX.101:9300]]
[indices:data/read/search[phase/scan/scroll]]]; nested: SearchContextMissingException[No search context found for id [15]]; "},
{"status":404,"reason":"SearchContextMissingException[No search context found for id [14]]"},
{"status":404,"reason":"RemoteTransportException[[pegasus_101][inet[/XXX.XXX.XXX.101:9300]]
[indices:data/read/search[phase/scan/scroll]]]; nested: SearchContextMissingException[No search context found for id [14]]; "},
{"status":404,"reason":"SearchContextMissingException[No search context found for id [15]]"}]
},"hits":{"total":37935898,"max_score":0.0,"hits":[]}}')
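To see at a glance how many shards lost their search context, the JSON body in the error can be summarised. A rough sketch only: `summarize_shard_failures` is a hypothetical helper, it assumes the error body is available as JSON text, and the sample is trimmed from the trace above to two of the five failures.

```python
import json
import re


def summarize_shard_failures(error_body):
    """Return (total_shards, failed_shards, missing_context_ids)."""
    doc = json.loads(error_body)
    shards = doc["_shards"]
    context_ids = []
    for failure in shards.get("failures", []):
        # Pull the numeric id out of "No search context found for id [13]".
        match = re.search(r"No search context found for id \[(\d+)\]",
                          failure["reason"])
        if match:
            context_ids.append(int(match.group(1)))
    return shards["total"], shards["failed"], sorted(context_ids)


# Trimmed copy of the body from the trace: two of the five failures.
sample = (
    '{"_shards":{"total":5,"successful":0,"failed":5,"failures":['
    '{"status":404,"reason":"SearchContextMissingException'
    '[No search context found for id [13]]"},'
    '{"status":404,"reason":"SearchContextMissingException'
    '[No search context found for id [14]]"}]},'
    '"hits":{"total":37935898,"max_score":0.0,"hits":[]}}'
)
print(summarize_shard_failures(sample))
```

In the full trace every one of the 5 shards reports a missing context, so the whole scroll was lost rather than a single shard.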
My main suspicion is that I'm running this on underpowered hardware (more on the way), but if
anyone has other theories or more insight, I'd love to hear it. Searching shows that similar
problems have been around for a while.
thanks!
Peter