Getting ClusterBlockException because the node starts listening before it joins the cluster


(Selvam) #1

Hi All,

Today we got the exception below in the webapp layer (JBoss):
[SERVICE_UNAVAILABLE/1/state not recovered / initialized];[SERVICE_UNAVAILABLE/2/no master];

We are using elasticsearch.jar (transport client) to connect to Elasticsearch from JBoss.
The Elasticsearch version is 1.3.7.

  1. We initiated the start command at 2016-10-18 17:52:04,263

  2. The node started listening at
    [2016-10-18 17:52:06,491][INFO ][transport ] [node1] bound_address {inet[/XX.XX.XX.XX:YYYY]}, publish_address {inet[/XX.XX.XX.XX:YYYY]}

  3. We got the exception at
    2016-10-18 17:52:08,753 :sweat:

  4. The node joined the cluster at
    [2016-10-18 17:52:09,611][INFO ][cluster.service ] [node1] detected_master
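
For reference, the gaps between the timestamps above can be worked out with a few lines of Java. This is only a small sketch: the class and method names (`StartupWindow`, `millisBetween`) are made up for illustration, and it assumes Java 8+ for `java.time`.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not part of Elasticsearch.
final class StartupWindow {
    // Matches the timestamp layout used in the log lines, e.g. 2016-10-18 17:52:06,491
    private static final DateTimeFormatter LOG_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    private StartupWindow() {}

    // Milliseconds elapsed from one log timestamp to another.
    static long millisBetween(String from, String to) {
        LocalDateTime start = LocalDateTime.parse(from, LOG_FORMAT);
        LocalDateTime end = LocalDateTime.parse(to, LOG_FORMAT);
        return Duration.between(start, end).toMillis();
    }

    public static void main(String[] args) {
        String bound = "2016-10-18 17:52:06,491";   // transport bound_address logged
        String failed = "2016-10-18 17:52:08,753";  // exception seen in the webapp
        String joined = "2016-10-18 17:52:09,611";  // detected_master logged

        // prints "listening but not joined: 3120 ms"
        System.out.println("listening but not joined: " + millisBetween(bound, joined) + " ms");
        // prints "failure before join: 858 ms"
        System.out.println("failure before join: " + millisBetween(failed, joined) + " ms");
    }
}
```

So the node was reachable for about 3.1 seconds before it had a master, and the failing request arrived roughly 0.9 seconds before the join completed.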

Is there any way to prevent the node from listening before it has been added to the cluster?
[ or ]
How can we make the node accept requests only after it has joined the cluster?

Please help with this!


(Daniel Mitterdorfer) #2

Hi @Selvam_ayyanar,

when did this problem happen (at the start of the application)? Can you show some code? Did you do anything on the Elasticsearch side?

Daniel


(Selvam) #3

Hi @danielmitterdorfer

Thanks for the reply

This exception happens between the node's startup and the node joining the cluster.

From the application we tried to index a document; once the node has joined the cluster, this works fine.
Also, after stopping the node, the client is able to perform the index operation on another node (using sniff).

When we bring the node back, the exception is thrown before the node has joined the cluster.


(Daniel Mitterdorfer) #4

Hi @Selvam_ayyanar,

by "node", do you mean your application in JBoss that is connected via the transport client to your Elasticsearch cluster?

Daniel


(Selvam) #5

hi @danielmitterdorfer

the node = elasticsearch instance


(Daniel Mitterdorfer) #6

Hi @Selvam_ayyanar,

can you paste the exception trace that you got?

Daniel


(Selvam) #7

Hi @danielmitterdorfer

Below is the exception we got from JBoss:

org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];[SERVICE_UNAVAILABLE/2/no master];
    at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:138)
    at org.elasticsearch.action.count.TransportCountAction.checkGlobalBlock(TransportCountAction.java:128)
    at org.elasticsearch.action.count.TransportCountAction.checkGlobalBlock(TransportCountAction.java:64)


(Daniel Mitterdorfer) #8

Hi @Selvam_ayyanar,

great, exactly what I had expected. You just need to catch ClusterBlockException, check whether it is retryable and, if so, retry the request. A very rough sketch (untested but I hope you get the idea):

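// number of attempts in total, including the first one
int numberOfRetries = 3;
while (true) {
    try {
        return client.doSomeAction();
    } catch (ClusterBlockException ex) {
        // not retryable, or no attempts left -> bail
        if (!ex.retryable() || --numberOfRetries <= 0) {
            throw ex;
        }
        // wait a bit before the next retry
        // (Thread.sleep throws InterruptedException; the enclosing method must declare or handle it)
        Thread.sleep(1000L);
    }
}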

Edit: I figured you should also wait a little bit between retries. :slight_smile:
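
If it helps, the same idea can be packaged as a small standalone helper that does not depend on the Elasticsearch classes, so it can be tried in isolation. This is only a sketch under a few assumptions: Java 8+ (for lambdas), and the names `Retry` and `withRetries` are invented here and are not part of any Elasticsearch API. The caller supplies the check that decides whether an exception is worth retrying.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Hypothetical helper for illustration; not part of the Elasticsearch client.
final class Retry {
    private Retry() {}

    /**
     * Calls the action up to maxAttempts times in total.
     * A failure is retried only while attempts remain and isRetryable accepts it;
     * otherwise the exception is rethrown to the caller unchanged.
     */
    static <T> T withRetries(Callable<T> action,
                             Predicate<Exception> isRetryable,
                             int maxAttempts,
                             long waitMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return action.call();
            } catch (Exception ex) {
                // Out of attempts, or the caller says this failure is permanent: give up.
                if (attempt >= maxAttempts || !isRetryable.test(ex)) {
                    throw ex;
                }
                // Wait a bit before the next attempt.
                Thread.sleep(waitMillis);
            }
        }
    }
}
```

With the transport client, a call might then look roughly like `Retry.withRetries(() -> client.prepareCount("my_index").get(), ex -> ex instanceof ClusterBlockException && ((ClusterBlockException) ex).retryable(), 3, 1000L)`, where `my_index` is just a placeholder index name.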

Daniel

