Auto importing dangled indices

Hey, hoping I could get some more insight on an issue. I'm currently running an ELK stack inside GKE (client/master/data nodes). I recently upgraded the cluster from version 1.13.X.X to 1.14.X.X; during the upgrade process my ELK cluster restarted and now I can't see any of the old indices.

But if I check the logs of the master, I see the following, with all of the logstash indices listed:

[2019-10-11T19:29:15,123][INFO ][o.e.g.LocalAllocateDangledIndices] [SgPgEca] auto importing dangled indices [[logstash-2019.04.13/LGo42wdnS0-ykRsXtiZwaA]/CLOSE][[logstash-2019.09.17/qufOrIxjSNaO7mPXbXyhcw]/OPEN].....

Additionally, when I exec inside the es-data pod I'm able to see all the indices listed under /data/data/nodes/0/indices.
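For context, the check looked roughly like this (pod and namespace names here are just placeholders):

```bash
# List the on-disk index directories from inside the data pod
kubectl exec es-data-0 -n elk -- ls /data/data/nodes/0/indices
```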

Question: how do I restore the indices on my cluster?

This sounds like a bad situation - dangling indices indicate that something (often the master nodes) has lost data somewhere. Are you sure that all the master and data nodes are using persistent storage?
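As a quick sanity check, something along these lines shows which pods are actually backed by PersistentVolumeClaims (the namespace name is just an example):

```bash
# List the volume claims and the controllers managing the Elasticsearch pods
kubectl get pvc -n elk
kubectl get statefulsets,deployments -n elk
```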

I don't understand the version numbers you're quoting. Elasticsearch version numbers have three components (e.g. 7.4.0) and there hasn't ever been an Elasticsearch version 1.13.X or 1.14.X.

Thanks for the reply.

I was referring to Kubernetes versions, 1.13.X to 1.14.X.

The current version of Elasticsearch is 5.2.0. The master is not using any persistent storage; only the data pods have persistent storage attached.

I see this error message when I check the logs of the master on startup:

[2019-10-15T04:38:26,170][ERROR][o.e.g.LocalAllocateDangledIndices] [SgPgEca] unexpected failure during [allocation dangled indices [logstash-2019.04.13, ...

I think that explains it then. Master nodes require persistent storage. From the docs for 5.2:

Every data and master-eligible node requires access to a data directory where shards and index and cluster metadata will be stored.

If your master nodes are not using persistent storage then you have likely lost all cluster metadata. I would recommend standing up a fresh cluster and restoring from a recent snapshot.
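If you do have a snapshot repository available, the restore on 5.2 is roughly this (the repository, path and snapshot names are just examples, and the repository location must be a shared filesystem mounted on every node):

```bash
# Register a shared-filesystem snapshot repository
curl -XPUT 'http://localhost:9200/_snapshot/my_backup' -d '{
  "type": "fs",
  "settings": { "location": "/mount/backups/my_backup" }
}'

# Restore a snapshot from it
curl -XPOST 'http://localhost:9200/_snapshot/my_backup/snapshot_1/_restore'
```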

I also recommend upgrading. Version 5.2 was released 2½ years ago and reached the end of its supported life well over a year ago. There have been many improvements to resilience and stability since then.

In short, I guess all my data is gone if I don't have any persistent storage on the master?

Well, I guess I'll have to update my version of ELK :sweat_smile:

It's pretty bad to lose all the master nodes, yes. It's possible that some of your indices can be re-imported as dangling indices, but with essentially no guarantee of success or accuracy. A snapshot is a much more reliable recovery mechanism.

You've shared a handful of words from log messages indicating that the dangling-index import is also not working, but not nearly enough to see why. Can you share all the logs from all the nodes? Use https://gist.github.com/ or similar since there won't be enough space here.
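Something like this would capture each node's full log to a file you can upload (pod and namespace names are just examples):

```bash
# Dump the logs of each Elasticsearch pod to a local file
for pod in es-master-0 es-master-1 es-master-2 es-data-0; do
  kubectl logs "$pod" -n elk > "$pod.log"
done
```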

This is from my master node with the error.

Only 1 of the 3 masters is showing this error message related to dangling indices.

Ok, here's the root cause:

Caused by: java.lang.IllegalStateException: index uuid doesn't match expected: [--4fmHkyRxa1uqKfkqsoAw] but got: [DJfXvC9OQFS7WZiGk1o76g]
	... 28 more

Looking through the list I see that both of these UUIDs correspond to indices called logstash-2019.08.28. There are other duplicate names too. No idea how this has happened, but losing the cluster metadata would explain it. I think you're going to have to delete at least one of these index folders (i.e. nodes/0/indices/DJfXvC9OQFS7WZiGk1o76g/ or nodes/0/indices/--4fmHkyRxa1uqKfkqsoAw) on each node.
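For example, something along these lines on each data pod (the pod name is a placeholder, and which UUID you keep is your call - take a copy of the directory first if you can):

```bash
# Remove one copy of the duplicated logstash-2019.08.28 index data
kubectl exec es-data-0 -n elk -- \
  rm -rf /data/data/nodes/0/indices/DJfXvC9OQFS7WZiGk1o76g
```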

Did just that, removed the folder on the es-data pods.

I think I'll have to recreate the cluster with persistent disks across the board and update the ELK cluster to the newest supported versions, unfortunately.

luckily it wasn't production :slight_smile:

