Could not get cluster status after master node goes down

Hi,

Recently we upgraded ES to 5.6.7.

I have a two-node cluster. After stopping the master node, the command "curl -X GET localhost:9211/_cluster/health" doesn't respond.

I've run jstack and seen that many threads are blocked. For example:

Thread 22447: (state = BLOCKED)

  • sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
  • java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=175 (Compiled frame)
  • java.util.concurrent.LinkedTransferQueue.awaitMatch(java.util.concurrent.LinkedTransferQueue$Node, java.util.concurrent.LinkedTransferQueue$Node, java.lang.Object, boolean, long) @bci=184, line=737 (Compiled frame)
  • java.util.concurrent.LinkedTransferQueue.xfer(java.lang.Object, boolean, int, long) @bci=286, line=647 (Compiled frame)
  • java.util.concurrent.LinkedTransferQueue.take() @bci=5, line=1269 (Compiled frame)
  • org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take() @bci=4, line=161 (Compiled frame)
  • java.util.concurrent.ThreadPoolExecutor.getTask() @bci=149, line=1067 (Compiled frame)
  • java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=26, line=1127 (Interpreted frame)
  • java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Interpreted frame)
  • java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)

or

Thread 21494: (state = BLOCKED)

  • org.apache.logging.log4j.core.layout.TextEncoderHelper.writeChunkedEncodedText(java.nio.charset.CharsetEncoder, java.nio.CharBuffer, org.apache.logging.log4j.core.layout.ByteBufferDestination, java.nio.ByteBuffer, java.nio.charset.CoderResult) @bci=5, line=112 (Interpreted frame)
  • org.apache.logging.log4j.core.layout.TextEncoderHelper.writeEncodedText(java.nio.charset.CharsetEncoder, java.nio.CharBuffer, java.nio.ByteBuffer, org.apache.logging.log4j.core.layout.ByteBufferDestination, java.nio.charset.CoderResult) @bci=14, line=79 (Interpreted frame)
  • org.apache.logging.log4j.core.layout.TextEncoderHelper.encodeChunkedText(java.nio.charset.CharsetEncoder, java.nio.CharBuffer, java.nio.ByteBuffer, java.lang.StringBuilder, org.apache.logging.log4j.core.layout.ByteBufferDestination) @bci=91, line=143 (Interpreted frame)
  • org.apache.logging.log4j.core.layout.TextEncoderHelper.encodeText(java.nio.charset.CharsetEncoder, java.nio.CharBuffer, java.nio.ByteBuffer, java.lang.StringBuilder, org.apache.logging.log4j.core.layout.ByteBufferDestination) @bci=22, line=58 (Interpreted frame)
  • org.apache.logging.log4j.core.layout.StringBuilderEncoder.encode(java.lang.StringBuilder, org.apache.logging.log4j.core.layout.ByteBufferDestination) @bci=37, line=68 (Interpreted frame)
  • org.apache.logging.log4j.core.layout.StringBuilderEncoder.encode(java.lang.Object, org.apache.logging.log4j.core.layout.ByteBufferDestination) @bci=6, line=32 (Interpreted frame)

After starting the master node again I see the following error messages in the logs:

Cheers,
Vahid

Looks like a split-brain situation to me. You may want to have a look at this

It was working fine with two nodes on version 1.7. There is one replica shard for the indices, so if one node goes down the cluster should keep working properly, and when the downed node comes back it should rejoin as well...

So you are just testing that you can still query the cluster when one node goes down.

Can you show us your elasticsearch.yml?

discovery.zen.ping.unicast.hosts: ["ip-first-node:9311","ip-second-node:9311"]
cluster.name: cluster-name
indices.ttl.interval: 86400s
http.port: 9211
reindex.remote.whitelist: localhost:*
transport.tcp.compress: true
transport.tcp.port: 9311
path.repo: backup-folder
discovery.zen.minimum_master_nodes: 1
bootstrap.system_call_filter: false
path.data: data-folder
network.host: 0.0.0.0
node.name: ip-first-node:10110
action.auto_create_index: false

discovery.zen.minimum_master_nodes should be 2 here.
The quorum is calculated as described in Important Elasticsearch configuration | Elasticsearch Reference [5.6] | Elastic.
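For reference, a rough sketch of the quorum arithmetic for your two-node cluster (this assumes both nodes are master-eligible; the line below is what the formula yields, not something you posted):

# quorum = (master-eligible nodes / 2) + 1 = (2 / 2) + 1 = 2
discovery.zen.minimum_master_nodes: 2
# Note: with this value a lone surviving node cannot elect a master on its own.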

So if I have only 2 nodes and one crashes, will it still work with "discovery.zen.minimum_master_nodes: 2"?

Actually, this problem was reported by our customers, and they have 3 nodes with minimum_master_nodes set to 2. So I don't think this is the root cause.

Then your repro on a two-node cluster with the quorum set to 1 is wrong. Please revise it in line with what your customers see and report back.

So you mean that we cannot have a two-node cluster, and that if one node crashes the other node can no longer serve requests?

These are the additional settings which are applied on the three-node cluster:

cluster.routing.allocation.awareness.attributes: sitename
node.master: true
node.data: true
discovery.zen.fd.ping_timeout: 30s
discovery.zen.fd.ping_retries: 3
indices.memory.index_buffer_size: 20%
indices.fielddata.cache.size: 10%
gateway.expected_nodes: 1
cluster.routing.allocation.allow_rebalance: indices_primaries_active
node.attr.sitename: xxx
discovery.zen.ping_timeout: 10s
gateway.recover_after_nodes: 1
gateway.recover_after_time: 5m
discovery.zen.minimum_master_nodes: 2

I've also set minimum_master_nodes to 2 and restarted one node. The nodes cannot find each other... It works only if I restart both nodes...

After a YAML config change (on all nodes), a restart must be performed on all nodes.

Thank you for your feedback. I'm curious to know whether I can have a two-node cluster that keeps working even if one node crashes.

If you are looking for high availability you need a minimum of 3 master-eligible nodes so the remaining two nodes can form a majority and elect a new master.
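As a rough illustration of what that looks like, here is a hypothetical per-node sketch (the hostnames node1/node2/node3 are placeholders, not taken from your setup; the port matches the transport.tcp.port you posted):

cluster.name: cluster-name
node.master: true
node.data: true
discovery.zen.ping.unicast.hosts: ["node1:9311", "node2:9311", "node3:9311"]
# (3 master-eligible nodes / 2) + 1 = 2, so any two surviving nodes can still elect a master
discovery.zen.minimum_master_nodes: 2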

It'll work only if the node that goes down is not the master. AFAIK, re-election will take place only if both nodes are master-eligible.

These are the configurations of the three nodes:
node1:

node2:

node3:

After restarting the master node (node 1), the two remaining nodes don't respond to any request.
In the logs of node 3 there are messages saying that node 3 is still trying to connect to the gone master (node 1) instead of forming a new cluster with node 2, and node 2's logs show the same.

Thank you for your feedback.
Vahid

Both are master-eligible. Anyway, it seems that high availability with two nodes is no longer supported in versions higher than 1.x, and at least three nodes must be in the cluster.

It has never been possible to have a fully highly available cluster with only two nodes.


It was possible; we were using it for our test environments, and after crashing one node the cluster was still serving properly with only one node. However, we sometimes suffered from split brain, even with three nodes...

If you were able to serve writes when one node was down in a two-node cluster, it means that you have not set minimum_master_nodes correctly. This can lead to split-brain scenarios and data loss. You should always set minimum_master_nodes according to these guidelines.
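To make the risk concrete, here is a sketch of what the value from your first config allows on a two-node cluster (illustrative only):

discovery.zen.minimum_master_nodes: 1
# With a quorum of 1, each node can elect itself master if the link between
# them drops, so both sides keep accepting writes independently (split brain)
# and their data diverges.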

As I wrote in the initial comment, we were using ES 1.7, and with this version there was always a possibility of a split brain, even when configuring minimum_master_nodes to a high value (nodes/2 + 1) and with more than two nodes. Have a look at the old version (https://www.elastic.co/guide/en/elasticsearch/reference/1.7/modules-node.html#split-brain).

However, now we have a bigger problem: with three nodes and the configuration mentioned above we have no high availability!