Indices don't have primary shards while trying to take a snapshot

Hi Team,

We have an old Elasticsearch cluster running on 1.4.5. We wanted to take a snapshot of the data and then delete the cluster, but we have run into some issues.

While taking the snapshot, the status comes back as FAILED:
"state" : "FAILED",
"reason" : "Indices don't have primary shards +[[logstash-2019.10.22, logstash-2019.12.11, logstash-2019.02.12, logstash-2019.07.14, logstash-2019.11.09, logstash-2018.11.04, logstash-2019.10.08, logstash-2018.11.05, logstash-2019.12.31]]"

The cluster health is RED because there are unassigned shards:

{
"cluster_name" : "XXX",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 7,
"number_of_data_nodes" : 5,
"active_primary_shards" : 6261,
"active_shards" : 17582,
"relocating_shards" : 2,
"initializing_shards" : 0,
"unassigned_shards" : 180
}

We suspect we lost a data node at some point and have been unable to fix the unassigned shards.

We need help with:

(i) Fixing the unassigned shards. We do not know why they are unassigned. Sample output from /_cat/shards?h=index,shard,prirep,state,unassigned.reason | grep UNASSIGNED:

logstash-2019.02.12 2 r UNASSIGNED
logstash-2019.02.12 2 r UNASSIGNED
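
For reference, the full command we run looks like this (the host is a placeholder):

curl -s 'http://localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | grep UNASSIGNED
# prirep is "p" for a primary copy and "r" for a replica; the indices named in
# the snapshot failure are the ones whose primary copies are unassigned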

(ii) Even if we cannot fix the unassigned shards, we still need to snapshot the remaining shards, leaving out the 180 faulty unassigned ones.
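
Something along these lines, if the snapshot API supports it (my_backup and snapshot_1 are placeholder names; we have not verified these options against 1.4.5):

curl -XPUT 'http://localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=false' -d '{
  "ignore_unavailable": true,
  "partial": true
}'
# with "partial": true the snapshot is not aborted when some primary shards are
# missing; those shards are simply left out of the snapshot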

Kindly assist and let me know if you require further details.

Regards,
Muthu.

OMG.

17582 shards on a 5-node cluster?
Using version 1.4?

Yes, I can definitely confirm that you are, and will be, in trouble...

I don't really know how to fix that in the short term, as this is a very old, unmaintained version.
I'd probably look at a few things:

  • Remove all indices you don't need. I guess you are not using all the data behind those 17582 shards, so remove the old data with the Delete Index API.
  • Start new indices with only 1 shard (depending on your daily volume); see the sketch after this list.
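
Something along these lines, as a rough sketch (index names, the template name, and the host are examples only, and I have not tested this against 1.4.5):

# drop indices you no longer need (wildcards work on the Delete Index API
# unless destructive operations are restricted on your cluster)
curl -XDELETE 'http://localhost:9200/logstash-2018.*'

# make future daily indices use a single primary shard via an index template
curl -XPUT 'http://localhost:9200/_template/logstash_single_shard' -d '{
  "template": "logstash-*",
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}'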

If possible, start a new cluster on 6.6.0 and begin collecting data there. Once you no longer need the old one, remove the old cluster.

Finally, I suggest you look at the following resources about sizing:

https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing

https://www.elastic.co/webinars/using-rally-to-get-your-elasticsearch-cluster-size-right

