SearchContextMissingException during long scroll/scan operations?

Does anyone else have problems with SearchContextMissingExceptions in scroll/scan operations?

I have logstash indices which each contain tens of millions of records. I need to walk over
the entire index, processing data from each record. To do this, I'm using the elasticsearch-py
Python library, and Elasticsearch 1.6.0 on a small (4-node) cluster.

Here's my code:

import elasticsearch
import elasticsearch.exceptions
import elasticsearch.helpers as helpers

es = elasticsearch.Elasticsearch(['http://XXX.XXX.XXX.108:9200'], retry_on_timeout=True)

scanResp = helpers.scan(client=es, scroll="5m", index=index_name, timeout="5m", size=1000)

for resp in scanResp:
    # DO STUFF FOR ONE RECORD

The processing handles several thousand records a second while it's running, so I don't
think I'm hitting the 5-minute limit.

After an indeterminate amount of time - sometimes quickly sometimes not,
I get this stack dump. I've formatted the last part for easier reading,
and redacted part of the IP addresses.

Traceback (most recent call last):
  File "/home/ptrei/util/str2int.py", line 190, in <module>
    mymain()
  File "/home/ptrei/util/str2int.py", line 177, in mymain
    process_index(indexname)
  File "/home/ptrei/util/str2int.py", line 112, in process_index
    for resp in scanResp:
  File "/usr/lib/python2.6/site-packages/elasticsearch-1.4.0-py2.6.egg/elasticsearch/helpers/__init__.py", line 230, in scan
    resp = client.scroll(scroll_id, scroll=scroll)
  File "/usr/lib/python2.6/site-packages/elasticsearch-1.4.0-py2.6.egg/elasticsearch/client/utils.py", line 68, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/usr/lib/python2.6/site-packages/elasticsearch-1.4.0-py2.6.egg/elasticsearch/client/__init__.py", line 616, in scroll
    params=params, body=scroll_id)
  File "/usr/lib/python2.6/site-packages/elasticsearch-1.4.0-py2.6.egg/elasticsearch/transport.py", line 308, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/usr/lib/python2.6/site-packages/elasticsearch-1.4.0-py2.6.egg/elasticsearch/connection/http_urllib3.py", line 86, in perform_request
    self._raise_error(response.status, raw_data)
  File "/usr/lib/python2.6/site-packages/elasticsearch-1.4.0-py2.6.egg/elasticsearch/connection/base.py", line 102, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.NotFoundError: TransportError(404, u'{"_scroll_id":"c2NhbjswOzE7dG90YWxfaGl0czozNzkzNTg5ODs=","took":76,"timed_out":false,"_shards"
:{"total":5,"successful":0,"failed":5,"failures":[

{"status":404,"reason":"SearchContextMissingException[No search context found for id [13]]"},
{"status":404,"reason":"RemoteTransportException[[pegasus_101][inet[/XXX.XXX.XXX.101:9300]]
  [indices:data/read/search[phase/scan/scroll]]]; nested: SearchContextMissingException[No search context found for id [15]]; "},

{"status":404,"reason":"SearchContextMissingException[No search context found for id [14]]"},
{"status":404,"reason":"RemoteTransportException[[pegasus_101][inet[/XXX.XXX.XXX.101:9300]]
  [indices:data/read/search[phase/scan/scroll]]]; nested: SearchContextMissingException[No search context found for id [14]]; "},

{"status":404,"reason":"SearchContextMissingException[No search context found for id [15]]"}]

},"hits":{"total":37935898,"max_score":0.0,"hits":[]}}')

My main suspicion is that I'm running this on underpowered hardware (more on the way), but if
anyone has other theories or more insight, I'd love to hear it. Searching shows that similar
problems have been around for a while.

thanks!
Peter

Is your scroll code using the new scroll id which is sent on each scroll response? A common problem here is that people try to use the original scroll id for all scroll requests which would result in errors similar to this.

I'm using elasticsearch-py, a python wrapper for the API. I'm under the impression that this handles the
scroll-id internally - I certainly don't see it myself.
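For anyone wondering what "handled internally" means: here's a rough sketch of what `helpers.scan` does under the hood, assuming the ES 1.x scan/scroll protocol. The key point is that every scroll response carries a fresh `_scroll_id`, and the next request must send that newest id, not the original one. (The function name `scroll_all` and the sizes here are mine, not from the library.)

```python
def scroll_all(es, index, query=None, scroll="5m", size=1000):
    """Yield every hit from `index`, always passing the newest scroll id."""
    # Initial scan request: in ES 1.x, search_type="scan" returns no hits,
    # only a scroll id to start pulling batches with.
    resp = es.search(index=index,
                     body=query or {"query": {"match_all": {}}},
                     scroll=scroll, size=size, search_type="scan")
    scroll_id = resp["_scroll_id"]
    while True:
        resp = es.scroll(scroll_id=scroll_id, scroll=scroll)
        scroll_id = resp["_scroll_id"]  # refresh the id on every round trip
        hits = resp["hits"]["hits"]
        if not hits:  # an empty batch means the scan is exhausted
            break
        for hit in hits:
            yield hit
```

So if the library is doing its job, a stale scroll id shouldn't be the cause here.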

Sometimes, the process runs correctly for many hours before completing, or failing. Failure seems to
vaguely correlate with how heavily the ES cluster is being used, which is why I suspect my hardware
is to blame.

Peter

could it be the scroll id expired?

Jason:
The scroll operation sometimes runs for hours without a problem; at other times it fails within a few minutes. I don't think it's an expiration. I set the timeout to 5 minutes, which is way up from the default 10 seconds. I've tried tinkering with the scroll size: dropping to 50 (from 800-1000) seems to help, but not entirely.
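One workaround sketch, not a fix for the root cause: if the search context vanishes mid-scan, catch the error and restart the whole scan from scratch. This only makes sense if your per-record processing is idempotent (records will be seen again). The wrapper below is my own, generic helper, not part of elasticsearch-py; the exception class it would catch in this thread's case is `elasticsearch.exceptions.NotFoundError`, as shown in the stack trace above.

```python
def scan_with_restart(make_scan, retriable=(Exception,), max_restarts=3):
    """Iterate the generator returned by `make_scan()`; on a retriable
    error, build a fresh generator and start over (records may repeat)."""
    for attempt in range(max_restarts + 1):
        try:
            for record in make_scan():
                yield record
            return  # finished cleanly
        except retriable:
            if attempt == max_restarts:
                raise  # out of retries, surface the error
```

Used against the original code, something like:

    for resp in scan_with_restart(
            lambda: helpers.scan(client=es, scroll="5m", index=index_name, size=50),
            retriable=(elasticsearch.exceptions.NotFoundError,)):
        # DO STUFF FOR ONE RECORD (must tolerate duplicates)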

We've been having the same issue with our cluster. We are doing many scroll queries with the code between queries taking much less time than the scroll timeout. When the query is searching over a large dataset it has been failing with this error after several hundred iterations.

Did you ever find a solution here?


Is there any solution for this? Has anyone tried this on the latest version of ES?
For me there's no reason to use ES if this problem can't be solved.

We're having exactly the same issue. Did anyone find the exact reason, or maybe a workaround?

I'm the OP.

Sorry guys, I never found an ES solution, and wound up using Splunk to digest my raw data.
I have since moved on to other projects.

I'm also experiencing this currently; we're on ES 1.7, using the elasticsearch Python library.

I am facing the same issue as well. Did someone find a solution for this?