ClusterBlockException occurs when a node rejoins the cluster


#1

I was running resiliency tests for Elasticsearch and found that when a master or data node recovers after being killed, my client library throws org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized]; for up to a second after recovery. It seems to me that the client sends requests to a node that has not fully recovered, which causes this error. However, this does not seem right to me.
Has anyone else run into this?

Java client - 1.3.2 (provided by spring-data-elasticsearch:1.1.2.RELEASE)
Elasticsearch - 1.5

It might be an issue with the old 1.3.2 client. Is there any way to see the changelog for client releases?


(Mark Walkom) #2

That means the node has not (re)joined the cluster yet; you need to give it a bit more time.


#3

That makes sense, but it basically means that my requests will be lost under traffic when the node comes up.
Isn't it the master node's decision when a node is ready to process requests? Why is the node included in the availability pool when it is not ready?


(Mark Walkom) #4

How do you know it's part of the cluster though? Have you checked another active node in the cluster and can see it's listed?


#5

Let me explain the use case in detail:

  1. Start an ES cluster with several nodes.
  2. Start loading ES using a client app.
  3. Drop (kill) one of the ES nodes.
  4. Continue loading.
  5. Start the dropped ES node and let it become part of the cluster -> during this process I get the exception mentioned above.

To me that basically means that, from the client's perspective, the ES node is available to handle requests, but it is not actually ready.


(Mark Walkom) #6

There is a difference between the service being started, the client running, and the node actually being ready to push data to the cluster, though.


#7

So, how can I fix it?


(Mark Walkom) #8

Check the logs to see when the node actually joins the cluster, otherwise query the _cat/nodes endpoint via a different node for the same thing.
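
As a rough sketch of the second option, something like the following could ask another node for the list of node names and check whether the restarted node is in it. The class and method names, the base URL, and the node name are all made up for illustration; it assumes the HTTP API is reachable on the other node and uses `_cat/nodes?h=name` so the response is just one node name per line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any Elasticsearch client library.
final class CatNodesCheck {
    private CatNodesCheck() {}

    /** Returns true if nodeName appears as a whole line in a `_cat/nodes?h=name` response body. */
    static boolean listsNode(String catNodesBody, String nodeName) {
        for (String line : catNodesBody.split("\n")) {
            if (line.trim().equals(nodeName)) {
                return true;
            }
        }
        return false;
    }

    /** Fetches `_cat/nodes?h=name` from baseUrl, e.g. "http://other-node:9200" (assumed address). */
    static String fetchNodeNames(String baseUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(baseUrl + "/_cat/nodes?h=name").openConnection();
        conn.setConnectTimeout(2000);
        conn.setReadTimeout(2000);
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return body.toString();
    }
}
```

Usage would be along the lines of `CatNodesCheck.listsNode(CatNodesCheck.fetchNodeNames("http://other-node:9200"), "restarted-node")`, polled until it returns true.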


#9

So essentially there's no other way than to retry on ClusterBlockException in client code against another node.
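
Something like this is what I have in mind: a minimal sketch of a generic retry wrapper, not tied to any Elasticsearch API. `RetryOnBlock`, `call`, and the parameters are names I made up. The caller decides which exceptions are worth retrying, for example with a predicate such as `e -> e instanceof ClusterBlockException`. Whether a retried request actually lands on a different node depends on how the client picks nodes, so that part is an assumption.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper: retries a request when the failure is classified as retryable.
final class RetryOnBlock {
    private RetryOnBlock() {}

    /**
     * Runs the request up to maxAttempts times, sleeping backoffMillis between
     * attempts. Non-retryable failures and the final failure are rethrown as-is.
     */
    static <R> R call(Supplier<R> request,
                      Predicate<RuntimeException> retryable,
                      int maxAttempts,
                      long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return request.get();
            } catch (RuntimeException e) {
                // Give up on the last attempt or when the caller says not to retry.
                if (attempt >= maxAttempts || !retryable.test(e)) {
                    throw e;
                }
                try {
                    Thread.sleep(backoffMillis);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt flag and surface the original failure.
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}
```

A call site would wrap the actual client request in the `Supplier`, e.g. `RetryOnBlock.call(() -> doIndexRequest(), e -> e instanceof ClusterBlockException, 3, 200L)`, where `doIndexRequest` stands in for whatever the application already does.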


(Mark Walkom) #10

Yeah, but there is this https://github.com/elastic/elasticsearch/issues/11709

