I have a three-node setup. One of the nodes had a hard drive fail, and along with it the data snapshots were lost (bad planning).
Now we have two nodes running well (given the circumstances), and the third node is offline, with the old backup of the data folder copied back into the data folder.
This was the master node at the time of the incident, and it has not been run again until now.
I plan to use the elasticsearch-shard remove-corrupted-data tool; however, I am still trying to figure out how to do that.
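From what I can tell from the docs, the tool is run on the affected node while Elasticsearch is stopped and pointed at a specific index and shard. Something along these lines (the index name and shard ID below are just placeholders for my setup):

```
# Run on the affected node while Elasticsearch is shut down.
# "my-index" and shard 0 are placeholders for the corrupted shard.
bin/elasticsearch-shard remove-corrupted-data --index my-index --shard-id 0
```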
In the meantime, can I start the node to see the extent of the damage? Or am I risking the current cluster's stability and the possibility of restoring that data?
No, it's not safe to run a node that you've restored from a filesystem-level backup. See e.g. these docs:
There are no supported methods to restore any data from a filesystem-level backup. If you try to restore a cluster from such a backup, it may fail with reports of corruption or missing files or other data inconsistencies, or it may appear to have succeeded having silently lost some of your data.
It's certainly possible that restoring a node from a filesystem backup can be harmful to the cluster. It's not possible to say what will happen in your case; filesystem backups just aren't covered by tests. As the docs say, there are no supported methods to restore from a filesystem backup.