Cluster stuck in a yellow state

Hey all,

Recently I had to replace one of my ES nodes which died suddenly on Digital Ocean.

So I ran this command to remove the node from the cluster:

curl -XPUT localhost:9200/_cluster/settings -d '{
  "transient" :{
      "cluster.routing.allocation.exclude._ip" : "10.0.0.4"
   }
}';echo

(That's not the real IP of the host that was removed; I've changed it for this post.) I then restarted Elasticsearch on each node. When I was done restarting, the cluster was stuck in a 'yellow' state.

[root@logs:~] #curl http://localhost:9200/_cluster/health?pretty=true
{
  "cluster_name" : "elasticsearch",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 11,
  "active_shards" : 17,
  "relocating_shards" : 0,
  "initializing_shards" : 1,
  "unassigned_shards" : 4,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}

We're just using this cluster for Logstash, and we didn't need any of the historical log data, so I just deleted all the indices using curator, restarted all 3 Elasticsearch nodes, then restarted Logstash and started fresh. The indices are also backed up once per week in case they're ever needed.
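
For reference, the cleanup itself was basically a one-liner with curator (I'm paraphrasing from memory, and the exact flags depend on your curator version):

curator delete indices --all-indices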

Right now we only have two Logstash indices (plus the .kibana index) since clearing out all the old ones:

[root@logs:~] #curator show indices --all-indices
2015-11-01 15:44:26,628 INFO      Job starting: show indices
2015-11-01 15:44:26,643 INFO      Matching all indices. Ignoring flags other than --exclude.
2015-11-01 15:44:26,643 INFO      Matching indices:
.kibana
logstash-2015.10.31
logstash-2015.11.01

In the ES logs, I'm seeing these messages:

[2015-11-01 14:05:01,189][WARN ][cluster.action.shard     ] [JF_ES1] [logstash-2015.11.01][4] received shard failed for [logstash-2015.11.01][4], node[tFW2k6GdQY6_cJTDf2gbsg], relocating [4-YOtkA2T5WKtEr-j7ivnA], [P], s[RELOCATING], indexUUID [XhSTppLqRWi16GNbpkL2PA], reason [Failed to perform [indices:data/write/bulk[s]] on replica, message [SendRequestTransportException[[JF_ES2][inet[/10.10.10.5:9300]][indices:data/write/bulk[s][r]]]; nested: NodeNotConnectedException[[JF_ES2][inet[/10.10.10.5:9300]] Node not connected]; ]]
[2015-11-01 14:05:01,240][WARN ][cluster.action.shard     ] [JF_ES1] [logstash-2015.11.01][1] received shard failed for [logstash-2015.11.01][1], node[363aqoP9QCWoBr8vAkrnZw], relocating [4-YOtkA2T5WKtEr-j7ivnA], [R], s[RELOCATING], indexUUID [XhSTppLqRWi16GNbpkL2PA], reason [Failed to perform [indices:data/write/bulk[s]] on replica, message [SendRequestTransportException[[JF_ES2][inet[/10.10.10.5:9300]][indices:data/write/bulk[s][r]]]; nested: NodeNotConnectedException[[JF_ES2][inet[/10.10.10.5:9300]] Node not connected]; ]]

It's saying that node 2 is not connected, but I am able to telnet from the 1st ES node to the 2nd one on both ports:

[root@logs:~] #telnet es2.example.com 9300
Trying 10.10.10.5...
Connected to es2.example.com.
Escape character is '^]'.
^]

telnet> quit
Connection closed.
[root@logs:~] #telnet es2.example.com 9200
Trying 10.10.10.5...
Connected to es2.example.com.
Escape character is '^]'.
^]

telnet> quit
Connection closed.

I can also telnet from the 2nd ES node to the 1st ES node, though I won't demonstrate that here.

The node that the logs are complaining about is the 2nd node (JF_ES2), which just replaced the dead one.

What can I do to correct this problem, and return the cluster to a Green state?

Thanks

There's no real need to do that restart step; once you set the exclude, ES will start to move data off that node. And of course, since you set it as a transient setting and then restarted the cluster, the setting is removed, which means your original change was pointless.
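
If you ever need to clear that exclusion without a restart, you can check what's currently applied and then blank it out; something roughly like this should do it on 1.x (setting the value to an empty string removes it, if I recall correctly):

curl localhost:9200/_cluster/settings?pretty
curl -XPUT localhost:9200/_cluster/settings -d '{
  "transient" : {
      "cluster.routing.allocation.exclude._ip" : ""
   }
}'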

What does _cat/nodes show? Have you checked your ES logs?
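
It would also be worth looking at the per-shard view, to see exactly which shards are unassigned and which index they belong to; something like this (the grep just hides the healthy shards):

curl 'localhost:9200/_cat/shards?v' | grep -v STARTED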

This is what I see when I cat the nodes:

[root@logs:~] #curl localhost:9200/_cat/nodes?pretty=true
es3  66.147.235.108  7 34 0.06 d m JF_ES3
es2  66.147.235.153 13 33 0.09 d m JF_ES2
logs 216.120.248.98  5 36 0.34 d * JF_ES1

And yes, I did check the logs! Not only that, I included a log snippet in the original post and asked about what I saw! :stuck_out_tongue:

Thanks!

Check the logs of the node that everyone thinks is disconnected and see if there is anything.
Do you have minimum masters set correctly?

Nothing looks amiss in the logs from the 2nd node:

[root@es2:~] #tail -f /var/log/elasticsearch/elasticsearch.log
[2015-11-01 14:05:36,313][INFO ][node                     ] [JF_ES2] closed
[2015-11-01 14:05:37,530][WARN ][bootstrap                ] Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).
[2015-11-01 14:05:37,695][INFO ][node                     ] [JF_ES2] version[1.7.3], pid[17610], build[05d4530/2015-10-15T09:14:17Z]
[2015-11-01 14:05:37,695][INFO ][node                     ] [JF_ES2] initializing ...
[2015-11-01 14:05:37,800][INFO ][plugins                  ] [JF_ES2] loaded [], sites []
[2015-11-01 14:05:37,851][INFO ][env                      ] [JF_ES2] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [24.1gb], net total_space [27.5gb], types [rootfs]
[2015-11-01 14:05:41,126][INFO ][node                     ] [JF_ES2] initialized
[2015-11-01 14:05:41,126][INFO ][node                     ] [JF_ES2] starting ...
[2015-11-01 14:05:41,228][INFO ][transport                ] [JF_ES2] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/66.147.235.153:9300]}
[2015-11-01 14:05:41,245][INFO ][discovery                ] [JF_ES2] elasticsearch/4-YOtkA2T5WKtEr-j7ivnA
[2015-11-01 14:05:44,403][INFO ][cluster.service          ] [JF_ES2] detected_master [JF_ES1][363aqoP9QCWoBr8vAkrnZw][logs][inet[/10.10.10.3:9300]]{master=true}, added {[JF_ES3][tFW2k6GdQY6_cJTDf2gbsg][es3][inet[/66.147.235.108:9300]]{master=true},[JF_ES1][363aqoP9QCWoBr8vAkrnZw][logs][inet[/10.10.10.3:9300]]{master=true},}, reason: zen-disco-receive(from master [[JF_ES1][363aqoP9QCWoBr8vAkrnZw][logs][inet[/10.10.10.3:9300]]{master=true}])
[2015-11-01 14:05:44,570][INFO ][http                     ] [JF_ES2] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/66.147.235.153:9200]}
[2015-11-01 14:05:44,570][INFO ][node                     ] [JF_ES2] started

I have this set as my minimum masters on each node:

discovery.zen.minimum_master_nodes: 2
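
If I understand the quorum formula right, 2 should be correct for 3 master-eligible nodes, since (3 / 2) + 1 = 2. I believe the same setting can also be applied to the running cluster without a restart, something along these lines (just a sketch, not something I've actually run here):

curl -XPUT localhost:9200/_cluster/settings -d '{
  "persistent" : {
      "discovery.zen.minimum_master_nodes" : 2
   }
}'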

Any thoughts on how I can solve this?

Thanks!

You may just need to restart that node.

Ok, I just did. Still not seeing anything wrong in the logs on the second node.

However I am still seeing some errors in the logs on the 1st node:

[2015-11-01 16:39:02,983][WARN ][cluster.action.shard     ] [JF_ES1] [logstash-2015.11.01][1] received shard failed for [logstash-2015.11.01][1], node[363aqoP9QCWoBr8vAkrnZw], relocating [zz4JIc6mRCiBk_2v8jdC0Q], [R], s[RELOCATING], indexUUID [XhSTppLqRWi16GNbpkL2PA], reason [Failed to perform [indices:data/write/bulk[s]] on replica, message [SendRequestTransportException[[JF_ES2][inet[/66.147.235.153:9300]][indices:data/write/bulk[s][r]]]; nested: NodeNotConnectedException[[JF_ES2][inet[/66.147.235.153:9300]] Node not connected]; ]]
[2015-11-01 16:40:42,224][WARN ][cluster.action.shard     ] [JF_ES1] [logstash-2015.11.01][1] received shard failed for [logstash-2015.11.01][1], node[zz4JIc6mRCiBk_2v8jdC0Q], [R], s[INITIALIZING], unassigned_info[[reason=ALLOCATION_FAILED], at[2015-11-01T21:39:03.002Z], details[Failed to perform [indices:data/write/bulk[s]] on replica, message [SendRequestTransportException[[JF_ES2][inet[/66.147.235.153:9300]][indices:data/write/bulk[s][r]]]; nested: NodeNotConnectedException[[JF_ES2][inet[/66.147.235.153:9300]] Node not connected]; ]]], indexUUID [XhSTppLqRWi16GNbpkL2PA], reason [Failed to perform [indices:data/write/bulk[s]] on replica, message [SendRequestTransportException[[JF_ES2][inet[/66.147.235.153:9300]][indices:data/write/bulk[s][r]]]; nested: NodeNotConnectedException[[JF_ES2][inet[/66.147.235.153:9300]] Node not connected]; ]]

Sorry if this is a little sloppy, but I'm not sanitising the IPs this time around.
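
One thing I'm thinking of checking next, just to see whether node 1 really has transport connections open to node 2 (this is only my guess at a useful diagnostic):

netstat -ant | grep 9300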

And the cluster remains in a yellow state:

[root@logs:~] #curl localhost:9200/_cluster/health?pretty=true
{
  "cluster_name" : "elasticsearch",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 11,
  "active_shards" : 17,
  "relocating_shards" : 0,
  "initializing_shards" : 1,
  "unassigned_shards" : 4,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}

And if I look in Kopf, this is what I'm seeing there:

 4 unassigned shards
 1 initializing shards

It may be that node 1 is the one with the issue then? Can you restart that?

Ok, I'm taking this slow, because I really want to solve this problem and understand the solution!

When I restarted just the 1st node, node 3 was elected as the master of the cluster, the 2nd node dropped out, and the cluster went into a red state instead of a yellow one!

And I saw this in the logs on the 1st node:

[2015-11-01 16:48:27,699][INFO ][transport                ] [JF_ES1] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/216.120.248.98:9300]}
[2015-11-01 16:48:27,750][INFO ][discovery                ] [JF_ES1] elasticsearch/QHJdvWy8RH6-BKzjz9AWpw
[2015-11-01 16:48:29,388][DEBUG][action.admin.cluster.health] [JF_ES1] no known master node, scheduling a retry
[2015-11-01 16:48:30,350][DEBUG][action.admin.indices.create] [JF_ES1] no known master node, scheduling a retry
[2015-11-01 16:48:33,439][INFO ][cluster.service          ] [JF_ES1] detected_master [JF_ES3][tFW2k6GdQY6_cJTDf2gbsg][es3][inet[/66.147.235.108:9300]]{master=true}, added {[JF_ES3][tFW2k6GdQY6_cJTDf2gbsg][es3][inet[/66.147.235.108:9300]]{master=true},}, reason: zen-disco-receive(from master [[JF_ES3][tFW2k6GdQY6_cJTDf2gbsg][es3][inet[/66.147.235.108:9300]]{master=true}])
[2015-11-01 16:48:33,508][INFO ][node                     ] [JF_ES1] started
[2015-11-01 16:49:33,861][DEBUG][action.bulk              ] [JF_ES1] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-11-01 16:49:33,862][DEBUG][action.bulk              ] [JF_ES1] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]

Like I mentioned, the cluster is now red:

[root@logs:~] #curl localhost:9200/_cluster/health?pretty=true
{
  "cluster_name" : "elasticsearch",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 7,
  "active_shards" : 13,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 9,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}

In the logs on node 2 I see this message repeated over and over again:

[2015-11-01 16:54:52,587][INFO ][discovery.zen            ] [JF_ES2] failed to send join request to master [[JF_ES3][tFW2k6GdQY6_cJTDf2gbsg][es3][inet[/66.147.235.108:9300]]{master=true}], reason [RemoteTransportException[[JF_ES3][inet[/66.147.235.108:9300]][internal:discovery/zen/join]]; nested: ConnectTransportException[[JF_ES2][inet[/66.147.235.153:9300]] connect_timeout[30s]]; nested: NoRouteToHostException[No route to host]; ]
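
That NoRouteToHostException makes me wonder whether a firewall rule on one of these hosts is blocking the transport port. If it keeps happening, I'll probably check along these lines (just my guess at a sensible next step):

# on node 2, look for REJECT/DROP rules that could cover 9300
iptables -L -n
# and from node 3, test node 2's raw publish address
telnet 66.147.235.153 9300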

So then I restarted node 3, and that elected node 1 as the master.

[root@logs:~] #curl localhost:9200/_cat/master?v
id                     host ip             node
QHJdvWy8RH6-BKzjz9AWpw logs 216.120.248.98 JF_ES1

And that improved the health of the cluster, but it's still only yellow and not green!

[root@logs:~] #curl localhost:9200/_cluster/health?pretty=true
{
  "cluster_name" : "elasticsearch",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 11,
  "active_shards" : 17,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 5,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}

And node 2 rejoined the cluster. These are the last entries I have in the log on node 1 after restarting all 3 nodes:

[2015-11-01 16:53:46,356][DEBUG][action.bulk              ] [JF_ES1] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-11-01 16:54:46,600][DEBUG][action.bulk              ] [JF_ES1] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-11-01 16:54:46,601][DEBUG][action.bulk              ] [JF_ES1] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-11-01 16:55:51,452][INFO ][discovery.zen            ] [JF_ES1] master_left [[JF_ES3][tFW2k6GdQY6_cJTDf2gbsg][es3][inet[/66.147.235.108:9300]]{master=true}], reason [shut_down]
[2015-11-01 16:55:51,455][WARN ][discovery.zen            ] [JF_ES1] master left (reason = shut_down), current nodes: {[JF_ES1][QHJdvWy8RH6-BKzjz9AWpw][logs][inet[/216.120.248.98:9300]]{master=true},}
[2015-11-01 16:55:51,455][INFO ][cluster.service          ] [JF_ES1] removed {[JF_ES3][tFW2k6GdQY6_cJTDf2gbsg][es3][inet[/66.147.235.108:9300]]{master=true},}, reason: zen-disco-master_failed ([JF_ES3][tFW2k6GdQY6_cJTDf2gbsg][es3][inet[/66.147.235.108:9300]]{master=true})
[2015-11-01 16:55:54,542][INFO ][cluster.service          ] [JF_ES1] new_master [JF_ES1][QHJdvWy8RH6-BKzjz9AWpw][logs][inet[/216.120.248.98:9300]]{master=true}, reason: zen-disco-join (elected_as_master)
[2015-11-01 16:55:54,543][INFO ][cluster.routing          ] [JF_ES1] delaying allocation for [6] unassigned shards, next check in [59.9s]
[2015-11-01 16:55:57,593][INFO ][cluster.service          ] [JF_ES1] added {[JF_ES2][H5TNmHbPS9eubL2v6s2oEQ][es2][inet[/66.147.235.153:9300]]{master=true},}, reason: zen-disco-receive(join from node[[JF_ES2][H5TNmHbPS9eubL2v6s2oEQ][es2][inet[/66.147.235.153:9300]]{master=true}])
[2015-11-01 16:56:00,317][INFO ][cluster.service          ] [JF_ES1] added {[JF_ES3][k5uGGWSTTk-87KGP-uWPQQ][es3][inet[/66.147.235.108:9300]]{master=true},}, reason: zen-disco-receive(join from node[[JF_ES3][k5uGGWSTTk-87KGP-uWPQQ][es3][inet[/66.147.235.108:9300]]{master=true}])

These are the last log entries on node 2:

[2015-11-01 16:56:50,087][INFO ][cluster.service          ] [JF_ES2] detected_master [JF_ES1][QHJdvWy8RH6-BKzjz9AWpw][logs][inet[/216.120.248.98:9300]]{master=true}, added {[JF_ES1][QHJdvWy8RH6-BKzjz9AWpw][logs][inet[/216.120.248.98:9300]]{master=true},}, reason: zen-disco-receive(from master [[JF_ES1][QHJdvWy8RH6-BKzjz9AWpw][logs][inet[/216.120.248.98:9300]]{master=true}])
[2015-11-01 16:56:52,803][INFO ][cluster.service          ] [JF_ES2] added {[JF_ES3][k5uGGWSTTk-87KGP-uWPQQ][es3][inet[/66.147.235.108:9300]]{master=true},}, reason: zen-disco-receive(from master [[JF_ES1][QHJdvWy8RH6-BKzjz9AWpw][logs][inet[/216.120.248.98:9300]]{master=true}])

And this is kind of interesting! I'm not seeing ANY new log entries on the 3rd node since July, even though it's part of the cluster and has been acting as the master on occasion:

[2015-07-18 18:37:19,111][INFO ][node                     ] [JF_ES3] stopped
[2015-07-18 18:37:19,111][INFO ][node                     ] [JF_ES3] closing ...
[2015-07-18 18:37:19,245][INFO ][node                     ] [JF_ES3] closed

I wonder if that fact alone, that no new log entries are turning up on node 3, could be contributing to my problems!

I welcome your feedback on this!

That's pretty odd, there's nothing from when you restarted it earlier?

I agree, that's odd! I'm restarting node 3 and there's still no new activity turning up in the log! What's up with that, I wonder?

It's the correct log file, right?

Yep! I believe so... there is a changed cluster name in the yaml file, but it's commented out.

[root@es3:~] #grep cluster.name /etc/elasticsearch/elasticsearch.yml
#cluster.name: JF_cluster

So it should just be using the default cluster name of 'elasticsearch' in that case if I'm not mistaken.

These are the only contents of the elasticsearch log directory:

[root@es3:~] #ls -l /var/log/elasticsearch/
total 20
-rw-r--r--. 1 996 995     0 Jul  1 22:45 elasticsearch-access.log
-rw-r--r--. 1 996 995     0 Jul  1 22:45 elasticsearch_index_indexing_slowlog.log
-rw-r--r--. 1 996 995     0 Jul  1 22:45 elasticsearch_index_search_slowlog.log
-rw-r--r--. 1 996 995 18561 Jul 18 18:37 elasticsearch.log

So I guess that's another problem I'll have to try and solve. In addition to why the cluster is still yellow at this point, I also need to figure out why node 3 isn't logging any activity while clearly remaining a part of the cluster.

[root@logs:~] #curl localhost:9200/_cat/nodes?pretty=true
es3  66.147.235.108 14 35 0.04 d m JF_ES3
logs 216.120.248.98  6 37 0.65 d * JF_ES1
es2  66.147.235.153 12 33 0.00 d m JF_ES2
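
One more thing I may try for the logging mystery: check which log files the Elasticsearch process on node 3 actually has open for writing. Just a sketch of what I have in mind (the pgrep pattern is a guess based on how ES 1.x is usually launched):

ES_PID=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch)
ls -l /proc/$ES_PID/fd | grep '\.log'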

Sorry for the delay in responding, btw. I had to run out to the store.

Hey Warkolm,

I was actually able to get my cluster out of its 'yellow' state using the advice from this article:

How to fix your Elasticsearch cluster stuck in initializing shards mode?
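
I won't claim this is exactly what the article walks through, but the general approach for shards stuck unassigned or initializing is to retry the allocation manually with the reroute API. A rough sketch of the kind of command involved (the index, shard number, and node name here are placeholders; the real values come from _cat/shards):

curl -XPOST localhost:9200/_cluster/reroute -d '{
  "commands" : [ {
      "allocate" : {
          "index" : "logstash-2015.11.01",
          "shard" : 4,
          "node" : "JF_ES2",
          "allow_primary" : false
      }
  } ]
}'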

But thanks for your help and advice all along!

Tim