Courier Fetch: x of y shards failed

Hello All,

I have a one-node cluster that has been running for the last 20 days. It takes syslogs from 4 devices. It worked fine until yesterday. Now, when I refresh my Kibana page, it shows this error at the top. If I press OK it goes away, so it is basically a warning. I just want to know how I can resolve this problem.

I have
Kibana : 4.3.1
ES : 2.1.1
Logstash : 2.1.1

I searched here and found a possible solution using thread pools, i.e.:
threadpool.search.queue_size: 2000

I added that to my elasticsearch.yml file, but with it in place I am not able to start the ES node / cluster. Maybe this option is not supported in this ES version.

Thanks,
Gaurav

What do your ES logs say? There should be something explaining the error.

Thanks for showing interest. Here is a dump from the log file.

[2016-03-11 14:13:35,838][DEBUG][action.search.type ] [gaurav-node] [logstash-2016.02.26][0], node[a7_9Js2SSOe9M3APDASbwQ], [P], v[4], s[STARTED], a[id=UmCpYbDlQVCE5YoO6VeUDg]: Failed to execute [org.elasticsearch.action.search.SearchRequest@3321f4bf] lastShard [true]
RemoteTransportException[[gaurav-node][172.20.203.191:9300][indices:data/read/search[phase/query]]]; nested: EsRejectedExecutionException[rejected execution of org.elasticsearch.transport.TransportService$4@14ed14f2 on EsThreadPoolExecutor[search, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@3f06eeb4[Running, pool size = 4, active threads = 4, queued tasks = 1000, completed tasks = 84579]]];
Caused by: EsRejectedExecutionException[rejected execution of org.elasticsearch.transport.TransportService$4@14ed14f2 on EsThreadPoolExecutor[search, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@3f06eeb4[Running, pool size = 4, active threads = 4, queued tasks = 1000, completed tasks = 84579]]]
at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:50)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:85)
at org.elasticsearch.transport.TransportService.sendLocalRequest(TransportService.java:346)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:310)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:282)
at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:142)
at org.elasticsearch.action.search.type.TransportSearchCountAction$AsyncAction.sendExecuteFirstPhase(TransportSearchCountAction.java:72)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:166)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:148)
at org.elasticsearch.action.search.type.TransportSearchCountAction.doExecute(TransportSearchCountAction.java:56)
at org.elasticsearch.action.search.type.TransportSearchCountAction.doExecute(TransportSearchCountAction.java:45)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:70)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:107)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:44)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:70)
at org.elasticsearch.action.search.TransportMultiSearchAction.doExecute(TransportMultiSearchAction.java:63)
at org.elasticsearch.action.search.TransportMultiSearchAction.doExecute(TransportMultiSearchAction.java:39)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:70)
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:58)
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:347)
at org.elasticsearch.client.FilterClient.doExecute(FilterClient.java:52)
at org.elasticsearch.rest.BaseRestHandler$HeadersAndContextCopyClient.doExecute(BaseRestHandler.java:83)
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:347)
at org.elasticsearch.client.support.AbstractClient.multiSearch(AbstractClient.java:600)
at org.elasticsearch.rest.action.search.RestMultiSearchAction.handleRequest(RestMultiSearchAction.java:74)
at org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:54)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:207)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:166)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:128)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:86)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:348)
at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:63)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)

The error is shown in the Kibana UI, but I don't see any logs generated for Kibana. I have not modified anything in the Kibana config file except the elasticsearch.url: parameter.

Looks like your ES node is overwhelmed.
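
The rejection line in the log above already carries the numbers behind that reading: the search pool has 4 threads, all 4 are active, and the queue holds 1000 tasks against a capacity of 1000. A minimal Python sketch that pulls those counters out of such a line follows. The function name `parse_rejection` and the regular expressions are illustrative assumptions, not part of Elasticsearch.

```python
import re

# Counter labels as they appear in the EsRejectedExecutionException message above.
_FIELDS = {
    "queue_capacity": r"queue capacity = (\d+)",
    "pool_size": r"pool size = (\d+)",
    "active_threads": r"active threads = (\d+)",
    "queued_tasks": r"queued tasks = (\d+)",
    "completed_tasks": r"completed tasks = (\d+)",
}


def parse_rejection(line):
    """Return the pool name and counters from a rejection log line, or None."""
    pool = re.search(r"EsThreadPoolExecutor\[(\w+),", line)
    if pool is None:
        return None
    stats = {"pool": pool.group(1)}
    for key, pattern in _FIELDS.items():
        match = re.search(pattern, line)
        if match is None:
            return None
        stats[key] = int(match.group(1))
    # Saturated: every thread is busy and the queue has no free slot left.
    stats["saturated"] = (
        stats["active_threads"] >= stats["pool_size"]
        and stats["queued_tasks"] >= stats["queue_capacity"]
    )
    return stats


# The "Caused by" line from the log in this thread.
sample = (
    "EsRejectedExecutionException[rejected execution of "
    "org.elasticsearch.transport.TransportService$4@14ed14f2 on "
    "EsThreadPoolExecutor[search, queue capacity = 1000, "
    "org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@3f06eeb4"
    "[Running, pool size = 4, active threads = 4, queued tasks = 1000, "
    "completed tasks = 84579]]]"
)
result = parse_rejection(sample)
print(result)
```

If `saturated` comes back true, as it does for this line, new search requests are being rejected because nothing can be queued, which is what Kibana reports as failed shards.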

Thanks. Do you know what the solution to this problem is?

Add more nodes or increase the available resources to these nodes.

Appreciate your quick help.

The machine my one-node cluster is running on shows this:
Filesystem            1K-blocks    Used Available Use% Mounted on
/dev/mapper/rhel-root  18307072 5231160  13075912  29% /
devtmpfs                1932128       0   1932128   0% /dev
tmpfs                   1941508       0   1941508   0% /dev/shm
tmpfs                   1941508  180684   1760824  10% /run
tmpfs                   1941508       0   1941508   0% /sys/fs/cgroup
/dev/sda1                508588  123284    385304  25% /boot

So I don't think I am using too many resources here. Do you know how I can avoid this error on this one-node cluster? I understand that adding more nodes would probably solve the problem, but I don't think I am using this machine up to 99% of its capacity.

Thanks,
Gaurav

Right, but what about CPU, memory, threadpools and all that other stuff?
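
One way to look at the thread pool side is the `_cat/thread_pool` API, which on ES 2.x reports active, queued and rejected counts per pool. Below is a rough Python sketch, assuming the node answers on `localhost:9200` and that the `search.active`, `search.queue` and `search.rejected` columns are available in your version. The helper names and the sample output are made up for illustration and are not taken from the cluster in this thread.

```python
from urllib.request import urlopen

# Assumption: Elasticsearch listens on localhost:9200; adjust for your setup.
CAT_URL = (
    "http://localhost:9200/_cat/thread_pool"
    "?v&h=host,search.active,search.queue,search.rejected"
)


def parse_cat_thread_pool(text):
    """Turn the whitespace-separated _cat table into a list of dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        row = dict(zip(header, ln.split()))
        # Every column after the host name is expected to be a counter.
        for key in header[1:]:
            row[key] = int(row[key])
        rows.append(row)
    return rows


def fetch_search_pool(url=CAT_URL):
    """Fetch the current search thread pool counters from a live node."""
    with urlopen(url, timeout=5) as resp:
        return parse_cat_thread_pool(resp.read().decode("utf-8"))


# Hypothetical output, for illustration only.
example = """host      search.active search.queue search.rejected
es-host-1             4         1000             312
"""
rows = parse_cat_thread_pool(example)
for row in rows:
    print(row["host"], row["search.queue"], row["search.rejected"])
```

Calling `fetch_search_pool()` against the real node a few times while reloading the Kibana page should show whether `search.queue` stays pinned at its limit and whether `search.rejected` keeps climbing. CPU and memory would still need to be checked separately with the usual OS tools, since `df` only covers disk.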