Hi,
I recently upgraded our Elasticsearch cluster from 6.2.3 to 6.6.1. One of the three nodes
is not connecting to the cluster; the other two nodes are working fine.
Here is the cluster health as reported by each of the 3 nodes:
root@data-1:~# curl -XGET https://localhost:9200/_cluster/health?pretty -k
Node0:
root@data-0:/home/usgaadmin# curl -XGET https://localhost:9200/_cluster/health?pretty -k
{
"cluster_name" : "rsi-rm-es-plt",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 2,
"active_primary_shards" : 5806,
"active_shards" : 11612,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 1,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
Node1:
{
"error" : {
"root_cause" : [
{
"type" : "master_not_discovered_exception",
"reason" : null
}
],
"type" : "master_not_discovered_exception",
"reason" : null
},
"status" : 503
}
Node2:
root@data-2:~# curl -XGET https://localhost:9200/_cluster/health?pretty -k
{
"cluster_name" : "rsi-rm-es-plt",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 2,
"active_primary_shards" : 5820,
"active_shards" : 11640,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 6,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 3227,
"active_shards_percent_as_number" : 100.0
}
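To compare the three responses above at a glance, something like the following sketch could be used. It only inspects already-parsed `_cluster/health` bodies; the function name `summarize_health` and the constant `EXPECTED_NODES` are made up for illustration, and the sample bodies are trimmed copies of the output pasted above.

```python
EXPECTED_NODES = 3  # assumption: the cluster in this post should have three nodes


def summarize_health(node_name, body, expected_nodes=EXPECTED_NODES):
    """Return a one-line summary of a parsed _cluster/health response body."""
    error = body.get("error")
    if error is not None:
        # The node answered, but it has no master and cannot serve cluster state.
        return (f"{node_name}: no cluster state "
                f"({error.get('type')}, HTTP {body.get('status')})")
    seen = body.get("number_of_nodes")
    missing = expected_nodes - seen
    note = "all nodes joined" if missing == 0 else f"{missing} node(s) missing"
    return f"{node_name}: {body.get('status')}, sees {seen}/{expected_nodes} nodes, {note}"


if __name__ == "__main__":
    # Trimmed copies of the three responses shown above.
    responses = {
        "data-0": {"status": "green", "number_of_nodes": 2},
        "data-1": {
            "error": {"type": "master_not_discovered_exception", "reason": None},
            "status": 503,
        },
        "data-2": {"status": "green", "number_of_nodes": 2},
    }
    for name, body in responses.items():
        print(summarize_health(name, body))
```

Note that "green" on data-0 and data-2 only means all shards known to the two-node cluster are assigned; both nodes still report 2 nodes instead of 3.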
Kibana has also not been working since the upgrade from 6.2.3 to 6.6.1.
From the beginning, the UI has only shown:
Kibana server is not ready yet
Here are the logs from the dead node:
[2019-03-12T00:51:56,193][DEBUG][o.e.a.a.i.g.TransportGetIndexAction] [data-1] timed out while retrying [indices:admin/get] after failure (timeout [30s])
[2019-03-12T00:51:56,193][WARN ][r.suppressed ] [data-1] path: /.kibana, params: {index=.kibana}
org.elasticsearch.discovery.MasterNotDiscoveredException: null
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:262) [elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:322) [elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:249) [elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:564) [elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:660) [elasticsearch-6.6.1.jar:6.6.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
[2019-03-12T01:35:54,412][DEBUG][o.e.a.a.c.h.TransportClusterHealthAction] [data-1] no known master node, scheduling a retry
[2019-03-12T01:36:24,413][DEBUG][o.e.a.a.c.h.TransportClusterHealthAction] [data-1] timed out while retrying [cluster:monitor/health] after failure (timeout [30s])
[2019-03-12T01:36:24,413][WARN ][r.suppressed ] [data-1] path: /_cluster/health, params: {pretty=}
org.elasticsearch.discovery.MasterNotDiscoveredException: null
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:262) [elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:322) [elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:249) [elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:564) [elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:660) [elasticsearch-6.6.1.jar:6.6.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]