Intermittent 503 error with Scroll API

Hi, I have built a console app in .NET Core that talks to Elasticsearch through NEST. I'm trying to send this request:

POST /products_en-ca_b2df44e0-2a4e-433e-9ea0-ec8a9e8f9e19/product/_search?pretty=true&error_trace=true&typed_keys=true&scroll=2m
{
  "from": 0,
  "size": 100,
  "query": {
    "match_all": {}
  }
}

In my program, it is written like this:

await ElasticClient.SearchAsync<Entities.Product>(search => search
    .From(0)
    .Take(100)
    .MatchAll()
    .Scroll(ScrollTimeout)
  );

Most of the time I get a normal response with hits. But sometimes, seemingly at random, I get this response instead:

{
  "error" : {
    "root_cause" : [ ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [ ],
    "stack_trace" : "Failed to execute phase [query], all shards failed\r\n\tat org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:293)\r\n\tat org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:133)\r\n\tat org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:254)\r\n\tat org.elasticsearch.action.search.InitialSearchPhase.onShardFailure(InitialSearchPhase.java:101)\r\n\tat org.elasticsearch.action.search.InitialSearchPhase.access$100(InitialSearchPhase.java:48)\r\n\tat org.elasticsearch.action.search.InitialSearchPhase$2.lambda$onFailure$1(InitialSearchPhase.java:222)\r\n\tat org.elasticsearch.action.search.InitialSearchPhase.maybeFork(InitialSearchPhase.java:176)\r\n\tat org.elasticsearch.action.search.InitialSearchPhase.access$000(InitialSearchPhase.java:48)\r\n\tat org.elasticsearch.action.search.InitialSearchPhase$2.onFailure(InitialSearchPhase.java:222)\r\n\tat org.elasticsearch.action.search.SearchExecutionStatsCollector.onFailure(SearchExecutionStatsCollector.java:73)\r\n\tat org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:51)\r\n\tat org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:464)\r\n\tat org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1130)\r\n\tat org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:1247)\r\n\tat org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1221)\r\n\tat org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:66)\r\n\tat 
org.elasticsearch.action.support.HandledTransportAction$ChannelActionListener.onFailure(HandledTransportAction.java:112)\r\n\tat org.elasticsearch.search.SearchService$2.onFailure(SearchService.java:347)\r\n\tat org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:341)\r\n\tat org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:335)\r\n\tat org.elasticsearch.search.SearchService$4.doRun(SearchService.java:1082)\r\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:723)\r\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)\r\n\tat org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:41)\r\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)\r\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)\r\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)\r\n\tat java.lang.Thread.run(Unknown Source)\r\n"
  },
  "status" : 503
}

Here is the stack trace in a better format:

Failed to execute phase [query], all shards failed
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:293)
at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:133)
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:254)
at org.elasticsearch.action.search.InitialSearchPhase.onShardFailure(InitialSearchPhase.java:101)
at org.elasticsearch.action.search.InitialSearchPhase.access$100(InitialSearchPhase.java:48)
at org.elasticsearch.action.search.InitialSearchPhase$2.lambda$onFailure$1(InitialSearchPhase.java:222)
at org.elasticsearch.action.search.InitialSearchPhase.maybeFork(InitialSearchPhase.java:176)
at org.elasticsearch.action.search.InitialSearchPhase.access$000(InitialSearchPhase.java:48)
at org.elasticsearch.action.search.InitialSearchPhase$2.onFailure(InitialSearchPhase.java:222)
at org.elasticsearch.action.search.SearchExecutionStatsCollector.onFailure(SearchExecutionStatsCollector.java:73)
at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:51)
at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:464)
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1130)
at org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:1247)
at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1221)
at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:66)
at org.elasticsearch.action.support.HandledTransportAction$ChannelActionListener.onFailure(HandledTransportAction.java:112)
at org.elasticsearch.search.SearchService$2.onFailure(SearchService.java:347)
at org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:341)
at org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:335)
at org.elasticsearch.search.SearchService$4.doRun(SearchService.java:1082)
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:723)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:41)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

And that's it. I have no clue what is happening. There is no more information about the error.

Thanks.

Bump

Hi,

How much time elapses between two calls to the scroll API?

bye,
Xavier

They are very quick. I'm chaining them as fast as possible because I want to retrieve the entire contents of my index, which can hold an indeterminate number of documents. Consecutive calls are about 80 ms to 300 ms apart.

And are you passing the scroll_id to the subsequent queries when scrolling?

https://www.elastic.co/guide/en/elasticsearch/reference/6.6/search-request-scroll.html

Yes. The first call looks like the one above; after that, I pass the scroll ID to each call and read the new one from the response to use on the next call.
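
For reference, the loop described above might look roughly like the sketch below. It assumes NEST 6.x and an already configured client with a default index; `ScrollExample` and `ReadAllAsync` are hypothetical names chosen for illustration, and this is not the original application code. It stops on the first invalid response and surfaces NEST's `DebugInformation`, which can carry more detail than the raw 503 body.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Nest;

public static class ScrollExample
{
    // Hypothetical helper: reads every document of type T through the scroll API.
    public static async Task<List<T>> ReadAllAsync<T>(IElasticClient client, string scrollTimeout = "2m")
        where T : class
    {
        var all = new List<T>();
        string scrollId = null;

        // The initial search opens the scroll context and returns the first page.
        var response = await client.SearchAsync<T>(search => search
            .Size(100)
            .MatchAll()
            .Scroll(scrollTimeout));

        try
        {
            while (true)
            {
                // Fail fast on an invalid response (for example the intermittent 503).
                if (!response.IsValid)
                    throw new InvalidOperationException(response.DebugInformation);

                // Always continue with the scroll ID from the most recent response.
                scrollId = response.ScrollId;

                // An empty page means the scroll is exhausted.
                if (response.Documents.Count == 0)
                    break;

                all.AddRange(response.Documents);
                response = await client.ScrollAsync<T>(scrollTimeout, scrollId);
            }
        }
        finally
        {
            // Release the scroll context on the server once we are done or have failed.
            if (scrollId != null)
                await client.ClearScrollAsync(c => c.ScrollId(scrollId));
        }

        return all;
    }
}
```

It would be called as `var products = await ScrollExample.ReadAllAsync<Entities.Product>(ElasticClient);`. Whether retrying the failed call is safe depends on the cause of the 503, which this thread has not established.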

Bump

Http.Sys does have a request queue and it will send 503s if the queue gets full. The RequestQueueLimit option was added in a later version that would let you adjust the length. If the logs indicate your queue is full then consider updating and setting this. ttrockstars
