ES Cluster Health in yellow due to some Replica shards in Unassigned state

damvinod · November 22, 2018, 9:13am

We have an Elasticsearch production cluster with 4 nodes:

Node 1:
node.master: true
Node 2:
node.master: true
node.data: true
Node 3:
node.master: true
node.data: true
Node 4: Coordinating node
node.master: false
node.data: false
node.ingest: false

We are facing an issue with replica shards not getting allocated on Node 2.
There are no issues with the primary shards on Node 3, which is the current master node.

The logs from Node 2 are below:

[2018-11-22T16:42:48,849][INFO ][o.e.d.z.ZenDiscovery ] [node-plmspapgs0g] master_left [{node-plmspapgs0h}{ADnQ6UDZRoOFFvnkcdh3Sw}{R17y6k_ZTyOTzIby9xlTIw}{X.X.X.X}}{X.X.X.X}:9300}{ml.machine_memory=33559379968, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}], reason [failed to ping, tried [3] times, each with maximum [30s] timeout]
[2018-11-22T16:42:48,850][WARN ][o.e.d.z.ZenDiscovery ] [node-plmspapgs0g] master left (reason = failed to ping, tried [3] times, each with maximum [30s] timeout), current nodes: nodes:
{node-plmspapgs0i}{31N0LIJiRtG3kfWsSoD2Iw}{RVg25n_5TomjJpi307wBVQ}{X.X.X.X}{X.X.X.X}:9300}{ml.machine_memory=16649068544, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}
{node-plmspapgs0e}{z49EVuIYQpOERYIYC2ZELA}{5WJEVl-NRtGH7A7qYUBxDg}{X.X.X.X}}{X.X.X.X}:9300}{ml.machine_memory=33559379968, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}
{node-plmspapgs0h}{ADnQ6UDZRoOFFvnkcdh3Sw}{R17y6k_ZTyOTzIby9xlTIw}{X.X.X.X}}{X.X.X.X}:9300}{ml.machine_memory=33559379968, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}, master
{node-plmspapgs0g}{ffFONm7QQRqTkB8cH2DlVg}{wB9thnAdSE256v41YipqTA}{X.X.X.X}}{X.X.X.X}:9300}{ml.machine_memory=33559379968, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, local

[2018-11-22T16:42:50,007][WARN ][r.suppressed ] path: /_xpack/monitoring/_bulk, params: {system_id=beats, system_api_version=6, interval=10s}
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/2/no master];
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:166) ~[elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedRaiseException(ClusterBlocks.java:152) ~[elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.xpack.monitoring.action.TransportMonitoringBulkAction.doExecute(TransportMonitoringBulkAction.java:56) ~[?:?]
at org.elasticsearch.xpack.monitoring.action.TransportMonitoringBulkAction.doExecute(TransportMonitoringBulkAction.java:36) ~[?:?]
at org.elasticsearch.action.support.TransportAction.doExecute(TransportAction.java:143) ~[elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:167) ~[elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.apply(SecurityActionFilter.java:128) ~[?:?]

As a workaround, we deleted all the indices. After that, cluster health stays green for the first 4-5 days.
The cluster then turns yellow again because replicas are not being allocated to Node 2.

Please help, as we have been facing this issue for a very long time.

geekpete · December 2, 2018, 11:25pm

What is your minimum master nodes setting (discovery.zen.minimum_master_nodes)?
Is it the same on all nodes?
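
For context, on 6.x this setting is normally a strict majority of the master-eligible nodes. A minimal sketch of that arithmetic, assuming the three nodes shown with node.master: true are the only master-eligible ones (the function name is made up for illustration):

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Return a strict majority (quorum) of the master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


# Assumption: Nodes 1, 2 and 3 are master-eligible, Node 4 is not.
quorum = minimum_master_nodes(3)
print(f"discovery.zen.minimum_master_nodes: {quorum}")
```

If the value in elasticsearch.yml differs from this, or differs between nodes, that is worth fixing first.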

The logs indicate that Node 2 cannot find a master node. Do you have any networking issues between Node 2 and the other nodes, or in general?
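
One rough way to check basic reachability is to open a TCP connection from Node 2 to the transport port (9300 in the logs) of each other node. This is only a sketch: the helper name is made up, the addresses are redacted in the logs so you would need to fill them in, and a successful connect does not prove that the master pings complete within their timeout.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Usage (run on Node 2, substituting the real address of each other node):
#   can_connect("<address of Node 3>", 9300)
```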

Do you have any shard allocation awareness in play?
https://www.elastic.co/guide/en/elasticsearch/reference/current/allocation-awareness.html

When you see the cluster go yellow, can you provide the Allocation Explain output?
GET /_cluster/allocation/explain
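
The response can be long, so here is a small sketch that pulls out the parts that usually matter: the shard, its state, the unassigned reason, and every decider on each node that did not answer YES. It assumes the response has already been parsed into a dict; the function name and the sample values are made up for illustration and are not from this cluster.

```python
def summarize_allocation_explain(explain: dict) -> list:
    """Condense an allocation explain response into a few readable lines."""
    kind = "primary" if explain.get("primary") else "replica"
    lines = [
        f"{explain.get('index')}[{explain.get('shard')}] {kind}: "
        f"{explain.get('current_state')}"
    ]
    reason = explain.get("unassigned_info", {}).get("reason")
    if reason:
        lines.append(f"  unassigned reason: {reason}")
    for node in explain.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            # Keep only the deciders that block or throttle allocation.
            if decider.get("decision") != "YES":
                lines.append(
                    f"  {node.get('node_name')}: {decider.get('decider')}: "
                    f"{decider.get('explanation')}"
                )
    return lines


# Made-up sample, shaped like an allocation explain response.
sample = {
    "index": "example-index",
    "shard": 0,
    "primary": False,
    "current_state": "unassigned",
    "unassigned_info": {"reason": "NODE_LEFT"},
    "node_allocation_decisions": [
        {
            "node_name": "node-a",
            "deciders": [
                {
                    "decider": "same_shard",
                    "decision": "NO",
                    "explanation": "placeholder explanation text",
                }
            ],
        }
    ],
}

for line in summarize_allocation_explain(sample):
    print(line)
```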

What replica settings do you have on the indices that go yellow when losing 1 of 3 data nodes in the cluster?
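
The reason this matters: a replica is never placed on the same node as another copy of the same shard, so the number of data nodes caps how many copies can be assigned. A minimal sketch of that limit, ignoring every other allocation rule (disk watermarks, awareness, filters); the function name is made up for illustration:

```python
def unassigned_copies_per_shard(number_of_replicas: int, data_nodes: int) -> int:
    """Copies of one shard that cannot be placed because nodes ran out.

    Each shard has 1 primary plus number_of_replicas replicas, and each
    copy needs its own data node.
    """
    copies = 1 + number_of_replicas
    return max(0, copies - data_nodes)


# With 2 data nodes left after losing 1 of 3:
print(unassigned_copies_per_shard(1, 2))  # one replica still fits
print(unassigned_copies_per_shard(2, 2))  # one replica per shard stays unassigned
```

So with number_of_replicas set to 2, losing one of three data nodes is enough on its own to turn the cluster yellow, while with 1 replica it should recover once the remaining nodes take over the copies.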

