I have a replica for each index, and Stack Monitoring shows my cluster as red. Please help.

It gives me a high severity alert with this message: "Elasticsearch cluster status is red. Allocate missing primary shards and replica shards."

Hi,
Please go to Dev Tools in Kibana (the little wrench icon) and execute the following commands:

GET _cluster/health

and

GET _cat/indices?health=red

and share the output with us.

{
"cluster_name" : "elk",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 1,
"active_primary_shards" : 87,
"active_shards" : 87,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 41,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 67.96875
}

red open .monitoring-logstash-7-2020.05.04 iZLVxS6pS3m1qZuAUkjkMg 1 0
red open .monitoring-kibana-7-2020.05.04 PQ4hJb0VSW6658zwBcz9BA 1 0
red open .monitoring-es-7-2020.05.04 kQiRarFZRZehWVbVdb1WVQ 1 0
red open .monitoring-beats-7-2020.05.04 rHFeNC7OQqyYOWv0nxHJjg 1 0

OK, these 4 .monitoring indices are the problem.
Could you check and share the output of:

GET _cluster/allocation/explain

{
"index" : "abc",
"shard" : 0,
"primary" : false,
"current_state" : "unassigned",
"unassigned_info" : {
"reason" : "CLUSTER_RECOVERED",
"at" : "2020-05-04T11:24:47.815Z",
"last_allocation_status" : "no_attempt"
},
"can_allocate" : "no",
"allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisions" : [
{
"node_id" : "SxnRYDVNQniIgOchfr1oMw",
"node_name" : "data-1",
"transport_address" : "ip:9300",
"node_attributes" : {
"ml.machine_memory" : "16657059840",
"ml.max_open_jobs" : "20",
"xpack.installed" : "true"
},
"node_decision" : "no",
"deciders" : [
{
"decider" : "same_shard",
"decision" : "NO",
"explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists [[siem-os-kores][0], node[SxnRYDVNQniIgOchfr1oMw], [P], s[STARTED], a[id=qkLBWfCFRaaBKNT_LSQ2sg]]"
}
]
}
]
}

The IP and index name have been renamed.

Here you can see the cause of a separate "problem": your unassigned replica shards (on their own they would turn the cluster yellow, not red).
As the cluster health output shows, you have a 3-node cluster but only 1 data node.
A replica shard is always assigned to a different data node than the one holding its primary shard. If there is no other data node, the replica stays unassigned.
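
To make the rule above concrete, here is a minimal Python sketch. It is not Elasticsearch code: the function name `assignable_copies` is made up for illustration, and it assumes the only constraint is that two copies of the same shard never share a data node.

```python
def assignable_copies(data_nodes: int, replicas: int) -> tuple:
    """Return (assigned, unassigned) copies for a single shard.

    Assumption: the only limit is the same-shard rule, i.e. each data
    node holds at most one copy (primary or replica) of a given shard.
    """
    wanted = 1 + replicas              # one primary plus its replicas
    assigned = min(wanted, data_nodes)
    return assigned, wanted - assigned


# One data node, one replica: the primary starts, the replica stays unassigned.
print(assignable_copies(data_nodes=1, replicas=1))  # (1, 1)
# Three data nodes, one replica: both copies can be placed.
print(assignable_copies(data_nodes=3, replicas=1))  # (2, 0)
```

With a single data node, every index that asks for a replica contributes at least one unassigned shard, which matches the same_shard decision in the explain output.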

I don't know your infrastructure, but in a 3-node cluster I would recommend using all 3 nodes as master and data nodes.

If this is not possible for you, you can set all indices to 0 replicas.

But for your red indices, please execute the command as follows:

GET _cluster/allocation/explain
{
  "index": ".monitoring-logstash-7-2020.05.04",
  "shard": 0,
  "primary": true
}

and share the output.

{
"index" : ".monitoring-logstash-7-2020.05.04",
"shard" : 0,
"primary" : true,
"current_state" : "unassigned",
"unassigned_info" : {
"reason" : "INDEX_REOPENED",
"at" : "2020-05-05T08:46:25.733Z",
"last_allocation_status" : "no_valid_shard_copy"
},
"can_allocate" : "no_valid_shard_copy",
"allocate_explanation" : "cannot allocate because a previous copy of the primary shard existed but can no longer be found on the nodes in the cluster",
"node_allocation_decisions" : [
{
"node_id" : "SxnRYDVNQniIgOchfr1oMw",
"node_name" : "data-1",
"transport_address" : "ip:9300",
"node_attributes" : {
"ml.machine_memory" : "16657059840",
"ml.max_open_jobs" : "20",
"xpack.installed" : "true"
},
"node_decision" : "no",
"store" : {
"found" : false
}
}
]
}

You said I can set all indices to 0 replicas. How do I set this?

Is there any risk of data loss if I do as you say and enable all 3 nodes as data nodes?

Thanks in advance for your quick replies.

Oh, did you remove a node from your cluster?
This problem is described in this article: RED Elasticsearch Cluster? Panic no longer | Elastic Blog

By the explain API telling us that there is no longer a valid shard copy for our primary shard, we know we have lost all of the nodes that held a valid shard copy. At this point, the only recourse is to wait for those nodes to come back to life and rejoin the cluster. In the odd event that all nodes holding copies of this particular shard are all permanently dead, the only recourse is to use the reroute commands to allocate an empty/stale primary shard and accept the fact that data has been lost.
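
For reference, the reroute command the article mentions could look roughly like the sketch below. It only builds and prints the JSON body for POST _cluster/reroute and sends nothing. The helper name `empty_primary_command` is made up for illustration, and the index and node names are taken from the output posted above. Running such a command means accepting that whatever the shard contained is gone for good, so treat it as a last resort.

```python
import json


def empty_primary_command(index: str, shard: int, node: str) -> dict:
    """Build the body for POST _cluster/reroute that allocates an empty primary.

    WARNING: this accepts permanent loss of whatever the shard contained.
    """
    return {
        "commands": [
            {
                "allocate_empty_primary": {
                    "index": index,
                    "shard": shard,
                    "node": node,
                    # Elasticsearch requires this flag to be set explicitly.
                    "accept_data_loss": True,
                }
            }
        ]
    }


body = empty_primary_command(".monitoring-logstash-7-2020.05.04", 0, "data-1")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into Dev Tools below a POST _cluster/reroute line.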

To set the number of replicas to 0, you can use this command:

PUT /indexname/_settings
{
  "number_of_replicas": 0
}

Replace indexname in the command with the name of your index.
You can also use wildcards here, such as index-* or just * for all indices.
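
If you want to script this rather than type it into Dev Tools, a minimal Python sketch of the same request might look like this. The helper name `replica_settings_request` is hypothetical, and it only builds the method, path and body without sending anything. It uses the nested "index" form of the setting, which should be equivalent to the short form shown above.

```python
import json


def replica_settings_request(index_pattern: str, replicas: int = 0) -> tuple:
    """Return (method, path, body) for updating number_of_replicas."""
    if replicas < 0:
        raise ValueError("replicas must be >= 0")
    body = {"index": {"number_of_replicas": replicas}}
    return "PUT", f"/{index_pattern}/_settings", json.dumps(body)


# "index-*" is just the example pattern from the post; use your own index names.
method, path, payload = replica_settings_request("index-*")
print(method, path)
print(payload)
```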

No, enabling all 3 nodes as data nodes does not put your data at risk; the existing data will be distributed across all data nodes without any data loss.

Thank you @KoettingSimon for your help. I will get back to you once I have tried your options.
