ClusterBlockException SERVICE_UNAVAILABLE/1/state not recovered

[2018-12-19T05:10:26,684][DEBUG][o.e.a.a.i.c.TransportCreateIndexAction] [epcvLK5] timed out while retrying [indices:admin/create] after failure (timeout [30s])
    org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];
    	at org.elasticsearch.cluster.block.ClusterBlocks.indexBlockedException(ClusterBlocks.java:189) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.action.admin.indices.create.TransportCreateIndexAction.checkBlock(TransportCreateIndexAction.java:64) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.action.admin.indices.create.TransportCreateIndexAction.checkBlock(TransportCreateIndexAction.java:39) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction.doStart(TransportMasterNodeAction.java:135) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction.start(TransportMasterNodeAction.java:127) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.action.support.master.TransportMasterNodeAction.doExecute(TransportMasterNodeAction.java:105) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.action.support.master.TransportMasterNodeAction.doExecute(TransportMasterNodeAction.java:55) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:167) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:139) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.action.support.HandledTransportAction$TransportHandler.messageReceived(HandledTransportAction.java:79) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.action.support.HandledTransportAction$TransportHandler.messageReceived(HandledTransportAction.java:69) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1555) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.common.util.concurrent.EsExecutors$1.execute(EsExecutors.java:135) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.transport.TcpTransport.handleRequest(TcpTransport.java:1512) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1382) ~[elasticsearch-6.2.3.jar:6.2.3]
    	at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:64) ~[?:?]
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
    	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
    	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310) ~[?:?]
    	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:297) ~[?:?]
    	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:413) ~[?:?]
    	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) ~[?:?]
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
    	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
    	at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) ~[?:?]
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
    	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
    	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1336) ~[?:?]
    	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1127) ~[?:?]
    	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1162) ~[?:?]
    	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489) ~[?:?]
    	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428) ~[?:?]
    	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) ~[?:?]
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
    	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
    	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359) ~[?:?]
    	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]

The cluster has 400 nodes and 20k shards, with a dedicated master instance with 16 cores, and it looks like it is stuck forever getting the cluster state recovered. There are no additional gateway.recover* settings. All through this time the cluster is unresponsive.
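For reference, a quick way to double-check which gateway/recovery settings the nodes were actually started with, and what the cluster reports while it is blocked (a rough sketch only; it assumes curl access to a node on the default port 9200, and the hostname is a placeholder):

    # Show the node-level settings each node was started with, filtered to gateway.*
    curl -s 'http://localhost:9200/_nodes/settings?flat_settings=true&pretty' | grep '"gateway\.'

    # Cluster health / recovery status while the state block is in place
    curl -s 'http://localhost:9200/_cluster/health?pretty'
    curl -s 'http://localhost:9200/_cat/health?v'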

[GJueMRG] discovered [{epcvLK5}{epcvLK51S-2841WxJWTQcg}{N_OZPZ_cSA-Dy5RL7zb1dQ}{10.xxx.xx.xxx}{10.xxx.xx.xxx:9300}{ zone=us-east-1b}] which is also master but with an older cluster_state, telling [{epcvLK5}{epcvLK51S-2841WxJWTQcg}{N_OZPZ_cSA-Dy5RL7zb1dQ}{10.xxx.xx.xxx}{10.xxx.xx.xxx:9300}{zone=us-east-1b}] to rejoin the cluster ([node fd ping])

What version?
Why do you have so many nodes for such a low shard count?

ES version is 6.2.
We had just started pumping in data when this happened. Any pointers on this would be greatly appreciated. It's endlessly going through a ClusterBlock -> MasterNotDiscovered -> ClusterBlock cycle @warkolm.

Can the cluster state get corrupted when there are multiple masters, and hence cause the cluster state recovery to fail endlessly?

How many master eligible nodes do you have in the cluster? How are the nodes in the cluster configured? What is the use-case?

Just before this happened we had 6 master-eligible nodes with a quorum of 2. We then removed 3 master-eligible nodes, but we still weren't able to recover the cluster state. Is it because the cluster state would get corrupted with multiple masters acting at the same time, @Christian_Dahlqvist?

That is, as you seem to have found out, incorrect. Make sure you always follow these guidelines.
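For what it's worth, with 6 master-eligible nodes the minimum should be a majority, i.e. (6 / 2) + 1 = 4, not 2. A minimal sketch of setting that (assuming Zen discovery on 6.x and curl access on the default port; in elasticsearch.yml it would be discovery.zen.minimum_master_nodes: 4, and the setting is also dynamic):

    # Raise minimum_master_nodes to a proper majority of the 6 master-eligible nodes
    curl -X PUT 'http://localhost:9200/_cluster/settings' \
      -H 'Content-Type: application/json' \
      -d '{"persistent": {"discovery.zen.minimum_master_nodes": 4}}'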

I am not sure if this could be caused by split-brain scenario or not, so will leave that for someone more knowledgeable in this area. I would certainly not rule it out though...

What is the output of the cluster health API?

To figure out if you have a split brain, you will need to ask every node which one it thinks is the master.

Given you have 400 (!) nodes, that is going to be a big pain.
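Something along these lines can do it in bulk (a sketch only; the host list is a placeholder for your master-eligible nodes, or ideally all nodes, and it assumes plain HTTP on port 9200):

    # Ask each node who it thinks the master is, and what it sees of the cluster
    for host in master-01 master-02 master-03 master-04 master-05 master-06; do
      echo "== $host"
      curl -s "http://$host:9200/_cat/master"
      curl -s "http://$host:9200/_cluster/health?filter_path=status,number_of_nodes&pretty"
    done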

I did a /_cat/master on each of the master-eligible nodes and found at least 3 distinct master nodes. But even after killing the rogue masters, we got a new master and the cluster state was still blocked forever.

I did a /_cluster/health call on all 6 master-eligible nodes previously; the nodes that weren't the master returned a total node count of 404, while the nodes that thought they were the master returned 405, with cluster status RED.

What is the use-case? What is the specification of each node? How much data does each node in the cluster hold?

Each node is an i3.xl with less than 15% disk utilization.

If I calculate correctly, that means that each node holds around 50 shards with an average size of around 3GB. Is that correct?
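(Back-of-the-envelope, assuming an i3.xl's ~950 GB NVMe volume:)

    20,000 shards / 400 nodes       = 50 shards per node
    950 GB x 15% disk utilization   ≈ 142 GB per node
    142 GB / 50 shards              ≈ 2.8 GB per shard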

What type of use-case is this? Is it a high throughput search use-case?

How many indices do you have? How many replica shards do you have configured?

How come you decided to go with relatively small nodes, when you could have had a smaller cluster if you had instead used i3.2xl nodes? Remember that the cluster state needs to be distributed to every node, which will take more time the more nodes you have in the cluster.

Given the low data volume per node, what drove you to 400 nodes in the cluster?

This was part of a stress test we wanted to do to see if we could go to 400 i3.2xl nodes with 3.5TB per instance, for a total storage of around 1.5PB. We didn't want to use more than a 32GB JVM heap. Wondering, if we have beefier masters, why should the node count be an issue? Would it take longer for ClusterBlock recovery?

What is the use-case? Without knowing this it is very hard to make any recommendations.

How do we recover from the ClusterBlock at the moment? This is turning out to be really painful. The use case is around log analytics; we need to aggregate reports.

If you are using the cluster for log analytics, you may want to consider a hot-warm architecture. This typically assumes that older data is not queried as frequently as newer data, but can save you a lot of hardware and allow you to run a smaller cluster.
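A minimal sketch of what that can look like (box_type is just a conventional attribute name and the index name is a placeholder): hot nodes are tagged with node.attr.box_type: hot and warm nodes with node.attr.box_type: warm in elasticsearch.yml, new indices are created on the hot tier, and once an index is no longer written to it is relocated with an index-level allocation filter:

    # Move an older index onto the warm nodes
    curl -X PUT 'http://localhost:9200/logs-2018.12.01/_settings' \
      -H 'Content-Type: application/json' \
      -d '{"index.routing.allocation.require.box_type": "warm"}'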

You probably also want to have a look at the following resources:

Although there is no built-in limit to cluster size, there is always a practical limit, as distributing cluster state changes gets slower the larger the cluster is. This is why it usually makes sense to scale up until you reach ~31GB of heap per node and then scale out. At some point it generally makes sense to instead start running multiple clusters and use cross-cluster search to query across them.
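As a rough illustration of the cross-cluster search part (the cluster alias, seed address and index pattern are placeholders; on 6.2 the remote cluster settings live under search.remote.*, while later 6.x releases renamed this to cluster.remote.*):

    # Register a second cluster as a remote on the local cluster
    curl -X PUT 'http://localhost:9200/_cluster/settings' \
      -H 'Content-Type: application/json' \
      -d '{"persistent": {"search.remote.logs_cluster_2.seeds": ["10.0.0.1:9300"]}}'

    # Search local indices and the remote cluster's indices in one request
    curl -s 'http://localhost:9200/logs-*,logs_cluster_2:logs-*/_search?size=0&pretty'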

As I do not have much experience with ClusterBlock issues, I will need to leave that for someone else.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.