X of x shards failed

Hi

I have a single-node setup.

It looks like each index has 5 primary shards, and when loading data on the Kibana dashboard we keep getting "x of x shards failed" error messages.

Can anyone advise how to solve this?

What do your Elasticsearch logs show?

I read somewhere that I probably shouldn't have 5 shards on a single-node deployment.

Would adding more ES nodes solve the issue?
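
As a quick check, the shard layout per index can be confirmed with the _cat shards API (a sketch assuming the index-* naming pattern that appears in the log later in the thread):

GET _cat/shards/index-*?v&h=index,shard,prirep,state,node

On a single node this would typically show 5 primary shards per daily index, all on the same node, plus any unassigned replica shards.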

The whole log message is too long to post here, but here is part of it:

[2019-06-19T08:18:08,250][DEBUG][o.e.a.s.TransportSearchAction] [nre3ddu] [index-2019.06.18][3], node[nre3dduXSIeBsIU67w9kQQ], [P], s[STARTED], a[id=GbCtqh0MRfG1r2gzwWhzkA]: Failed to execute [SearchRequest{searchType=QUERY_THEN_FETCH, indices=[index-*], indicesOptions=IndicesOptions[ignore_unavailable=true, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[], routing='null', preference='1560932209094', requestCache=null, scroll=null, maxConcurrentShardRequests=5, batchedReduceSize=512, preFilterShardSize=9, allowPartialSearchResults=true, source={"size":0,"query":{"bool":{"must":[{"match_all":{"boost":1.0}},{"match_all":{"boost":1.0}},{"range":{"@timestamp":{"from":1560845810780,"to":1560932210780,"include_lower":true,"include_upper":true,"format":"epoch_millis","boost":1.0}}},{"match_phrase":{"uri-path":{"query":"releases","slop":0,"zero_terms_query":"NONE","boost":1.0}}},{"bool":{"should":[{"match_phrase":{"uri-type":{"query":"sdk","slop":0,"zero_terms_query":"NONE","boost":1.0}}},{"match_phrase":{"uri-type":{"query":"client","slop":0,"zero_terms_query":"NONE","boost":1.0}}},{"match_phrase":{"uri-type":{"query":"server","slop":0,"zero_terms_query":"NONE","boost":1.0}}}],"adjust_pure_negative":true,"minimum_should_match":"1","boost":1.0}},{"bool":{"should":[{"match_phrase":{"cs-uri-stem":{"query":"zip","slop":0,"zero_terms_query":"NONE","boost":1.0}}},{"match_phrase":{"cs-uri-stem":{"query":"tar.bz2","slop":0,"zero_terms_query":"NONE","boost":1.0}}},{"match_phrase":{"cs-uri-stem":{"query":"run","slop":0,"zero_terms_query":"NONE","boost":1.0}}},{"match_phrase":{"cs-uri-stem":{"query":"dmg","slop":0,"zero_terms_query":"NONE","boost":1.0}}},{"match_phrase":{"cs-uri-stem":{"query":"unitypackage","slop":0,"zero_terms_query":"NONE","boost":1.0}}},{"match_phrase":{"cs-uri-stem":{"query":"exe","slop":0,"zero_terms_query":"NONE","boost":1.0}}}],"adjust_pure_negative":true,"minimum_should_match":"1","boost":1.0}}],"must_not":[{"match_phrase":{"sc-status":{"query":403,"slop":0,"zero_terms_query":"NONE","boost":1.0}}}],"adjust_pure_negative":true,"boost":1.0}},"_source":{"includes":[],"excludes":[]},"stored_fields":"*","docvalue_fields":[{"field":"@timestamp","format":"date_time"},{"field":"date","format":"date_time"}],"script_fields":{},"aggregations":{"2":{"terms":{"field":"geoip.country_name.keyword","size":20,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"_count":"desc"},{"_key":"asc"}]}}}}}] lastShard [true]
org.elasticsearch.transport.RemoteTransportException: [nre3ddu][127.0.0.1:9300][indices:data/read/search[phase/query]]
Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.common.util.concurrent.TimedRunnable@55384cfb on QueueResizingEsThreadPoolExecutor[name = nre3ddu/search, queue capacity = 1000, min queue capacity = 1000, max queue capacity = 1000, frame size = 2000, targeted response rate = 1s, task execution EWMA = 405nanos, adjustment amount = 50, org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutor@a8117c8[Running, pool size = 13, active threads = 13, queued tasks = 1029, completed tasks = 24350639]]
        at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:48) ~[elasticsearch-6.5.1.jar:6.5.1]
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830) ~[?:1.8.0_191]
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379) ~[?:1.8.0_191]
        at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:98) ~[elasticsearch-6.5.1.jar:6.5.1]
        at org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutor.doExecute(QueueResizingEsThreadPoolExecutor.java:86) ~[elasticsearch-6.5.1.jar:6.5.1]
        at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:93) ~[elasticsearch-6.5.1.jar:6.5.1]
        at org.elasticsearch.search.SearchService.lambda$rewriteShardRequest$4(SearchService.java:1074) ~[elasticsearch-6.5.1.jar:6.5.1]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:60) ~[elasticsearch-6.5.1.jar:6.5.1]
        at org.elasticsearch.index.query.Rewriteable.rewriteAndFetch(Rewriteable.java:114) ~[elasticsearch-6.5.1.jar:6.5.1]
        at org.elasticsearch.index.query.Rewriteable.rewriteAndFetch(Rewriteable.java:87) ~[elasticsearch-6.5.1.jar:6.5.1]

I reformatted your message using the </> button because it was unreadable as posted. It's a good idea to format messages well, as they're much more likely to get a useful response that way.

I see this in your stack trace:

Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of ... on QueueResizingEsThreadPoolExecutor[name = nre3ddu/search, queue capacity = 1000, ... queued tasks = 1029, ...]

This means that the node is overloaded, having over 1000 tasks in its search queue, and so it is rejecting further searches.
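
One quick way to see this is the _cat thread pool API, which reports the search queue depth and rejection count per node (a sketch; column names as in 6.x):

GET _cat/thread_pool/search?v&h=node_name,name,active,queue,rejected,completed

A queue value sitting near its capacity (1000 here) together with a growing rejected count matches the EsRejectedExecutionException in the log above.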

Thanks.

Is it possible to increase this queue size?

I have read this, but it isn't clear: How i can increase thread pool queue size in Elasticsearch 5.6
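
For reference, the queue capacity shown in that log line is controlled by static thread_pool settings in elasticsearch.yml (a sketch for 6.x; the 2000 value is just an example, and a node restart is required):

thread_pool.search.queue_size: 2000
thread_pool.search.min_queue_size: 2000
thread_pool.search.max_queue_size: 2000

As the next reply explains, though, raising these usually only delays the rejections rather than fixing the underlying overload.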

A longer queue is normally not the solution. If the node can't keep up with the workload, it'll overflow the queue no matter how long you make it. I'd suggest either reducing the workload or scaling out to multiple nodes.
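
As a sketch of the "reduce the workload" option on a single node: new daily indices can be created with a single primary shard and no replicas via an index template (the template name and the index-* pattern are just examples matching the log), so each dashboard query fans out to one shard instead of five:

PUT _template/index-single-shard
{
  "index_patterns": ["index-*"],
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}

This only affects indices created after the template is added; existing indices keep their 5 shards unless they are shrunk or reindexed.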

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.