Handling unmounted data volume

a06ced31bae02498a46d · January 7, 2021, 5:42pm

Hello!

We are facing an issue with the store volume hardware on a data node that results in a full production outage. While the problem clearly lies at the system/hardware level, there might be some Elasticsearch-level options to mitigate it.

Problem description

Elasticsearch data nodes use a separately mounted volume as the path.data location to store all the data. Approximately once a month, the store volume of one of the data nodes experiences a hardware issue that results in a forced shutdown of the XFS filesystem, after which the volume is no longer mounted.
The mount point directory, however, still exists on the root volume. After a system restart that follows the store volume error (we can't reproduce this, though), Elasticsearch effectively starts using the root volume for its data. The root volume is quite small, so within a few minutes Elasticsearch fills up the disk. After breaching the flood_stage watermark, Elasticsearch marks indices as read-only, which results in a service outage.
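To make the disk-filling step concrete, here is a minimal Python sketch of how the used fraction of the filesystem backing a path could be compared against a flood-stage threshold. The function names are hypothetical, the 0.95 default assumes the stock flood_stage setting of 95%, and this is only an external approximation, not how Elasticsearch itself evaluates the watermark.

```python
import shutil


def used_fraction(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem backing `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def breaches_flood_stage(path: str, threshold: float = 0.95) -> bool:
    """Return True if disk usage at `path` is at or above `threshold`.

    The 0.95 default is an assumption matching the default flood_stage
    watermark; adjust it if the cluster overrides that setting.
    """
    return used_fraction(path) >= threshold
```

Pointed at the path.data directory, this reports on whichever filesystem actually backs that directory, so on a node whose data volume is missing it would be measuring the small root volume.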

The timing (~5-10 minutes) and unpredictability of this event leave no room for proactive manual intervention to avoid the production outage.
We'd like to automate the mitigation so that whenever a data node loses its data store volume, it is automatically "excluded" from the cluster.

We figured that we could prevent the Elasticsearch process from starting by using the systemd AssertPathIsMountPoint setting.
However, it is unclear whether a system reboot happens consistently on store volume failure, so we can't rely on Elasticsearch being restarted when it occurs.
We are investigating whether fs.xfs.panic_mask could be used to force a system reboot on a store volume issue, but this setting is intended for debugging purposes only, so it is unclear how safe it would be to use in production.
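Since a reboot cannot be relied on, the same mount point test could also be run periodically while the node is up. Below is a minimal Python sketch of such a check. The function names and the example path are hypothetical, and what to do on failure (stop the service, alert, and so on) is left open.

```python
import os
import sys


def data_volume_is_mounted(path: str) -> bool:
    """Return True only if `path` is itself a mount point.

    A plain directory left behind on the root volume returns False,
    which is the situation described above.
    """
    return os.path.ismount(path)


def check(path: str) -> int:
    """Return 0 if `path` is a mount point, 1 otherwise (with a message on stderr)."""
    if data_volume_is_mounted(path):
        return 0
    print(f"{path} is not a mount point", file=sys.stderr)
    return 1


# Example (hypothetical path): a cron job or systemd timer could call
# check("/var/lib/elasticsearch") and stop the service on a non-zero result.
```

This only detects that the volume is no longer mounted. It does not detect a filesystem that is still mounted but returning I/O errors.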

Cluster information

Running on EC2, Amazon Linux 2.
ES version 6.8.3 (7.9.3 upgrade pending soon).
Data nodes are using m5d.4xlarge EC2 instance type, with two 300GB NVMe volumes mounted as a RAID0.

Questions

Has anyone else experienced a similar issue?
Are there any Elasticsearch-level options/ways to ensure the Elasticsearch process stops working in case of a store volume level failure?

Any info/ideas/suggestions are more than welcome.

Thank you.

DavidTurner · January 7, 2021, 5:59pm

Yes, as of 7.9.0 (https://github.com/elastic/elasticsearch/pull/52680) a node will remove itself from the cluster if its filesystem goes read-only. Making the filesystem go read-only when it encounters an error is up to you. ext* filesystems have a mount option errors=remount-ro; I'm not sure about XFS. Unmounting the filesystem on an error sounds like a very bad idea; marking it as read-only and letting it return errors to the application is much safer and is the expected behaviour.
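For monitoring from outside Elasticsearch, the read-only state of the data filesystem can be inspected through its mount flags. The following is a minimal Python sketch for Unix systems. The function name is hypothetical, and it is unrelated to how the Elasticsearch health check in the linked PR is implemented.

```python
import os


def is_read_only(path: str) -> bool:
    """Return True if the filesystem backing `path` is mounted read-only.

    Uses the ST_RDONLY bit in the mount flags reported by statvfs(3).
    """
    flags = os.statvfs(path).f_flag
    return bool(flags & os.ST_RDONLY)
```

A check like this reflects only the mount flags. A filesystem that is failing but has not yet been remounted read-only will still report False.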

Letting Elasticsearch start on the root volume after the data volume fails to mount sounds excessively lenient. A failure to mount a filesystem should be a fatal error; you shouldn't be letting the rest of the system start up if it encounters a mount failure.

