Migrated elasticsearch data from a failed node

anon85145925 · December 21, 2023, 4:57pm

Hello all,

One of our Elasticsearch nodes (not a master node) failed yesterday. The failure was caused by human error: while trying to add disk space, the underlying disk was shrunk instead. We managed to expand the disk again, checked and updated the filesystem, and updated the initramfs and the bootloader.

Unfortunately the OS cannot be booted directly and can only be mounted in a chroot for the time being. In the chroot, however, I can't properly run the Elasticsearch service either, so I can't connect it back to the cluster to migrate the shards/indexes.

The majority of the shards/indexes are still available on the filesystem in the chroot and amount to a bit over 1TB. There are no backups of the host OS/elasticsearch data.

Questions:

Is there a way to migrate the elasticsearch data from the failed node to the cluster without the API of the broken node? (am happy to spin up a new node if needed)
Is there a way to manually copy the data (like rsync) to the new node and then manually change the permissions/edit hostnames in the config files/etc?
If the answers to both of the above are no: does that mean all of that 1TB+ elasticsearch data is just garbage and should be rm -rf?

Thanks!

DavidTurner · December 21, 2023, 5:33pm

If the contents of the data path are truly intact then yes, it will work to copy it wholesale to a new location and start up a new node there. That might be a big "if" though: depending on how you shrank the disk, it could be in a very strange state which looks intact without actually being so. Elasticsearch should detect problems and prevent wholly invalid data from contaminating the rest of your cluster, but it's impossible to say how much it will be able to recover (if anything) without trying it, and we can't wholly rule out the possibility of some undetectable corruptions that will come back to bite you later.

Needless to say, that's bad. Recovering from a snapshot would be much safer than what I propose above.
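
If you do go the copy route, it may be worth confirming that the copy itself is byte-for-byte identical to what is in the chroot before starting a node on it. Below is a minimal sketch of such a check, assuming both trees are reachable from one machine; `compare_trees` and `file_digest` are hypothetical helper names, not part of Elasticsearch. Note that this only shows the copy matches the source. It says nothing about whether the source data was already damaged by the disk shrink.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(src: str, dst: str) -> List[str]:
    """List files under src that are missing from dst or differ in content.

    Hypothetical helper: it only checks src -> dst, so extra files that
    exist solely in dst are not reported.
    """
    src_root, dst_root = Path(src), Path(dst)
    problems = []
    for src_file in sorted(p for p in src_root.rglob("*") if p.is_file()):
        rel = src_file.relative_to(src_root)
        dst_file = dst_root / rel
        if not dst_file.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif file_digest(src_file) != file_digest(dst_file):
            problems.append(f"differs: {rel.as_posix()}")
    return problems
```

An empty list from `compare_trees(old_data_path, new_data_path)` means every file in the old data path has an identical counterpart in the new one. Hashing 1TB+ takes a while, so expect this to be slow.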
