Node sync fails and cluster goes to "red"

Hello @DavidTurner, I wanted to thank you once again. The cluster is green, backed up, and on a new disk :slight_smile: All of my research data is intact and now backed up both locally on a NAS (with disk-level redundancy) and in the cloud :smiley:

I wanted to ask whether it would be worthwhile for Metricbeat or another component of the Elastic Stack to keep an eye on disk errors. I know there isn't a specific module that looks at hardware health, but based on my experience it could be an excellent indicator to track under Stack Monitoring.

If you feel this is a meaningful addition I can open a GitHub feature request. :slight_smile:

Thank you and have a wonderful week ahead :smiley:

IMO yes, see e.g. "Collect SMART data with metricbeat · Issue #8614 · elastic/beats · GitHub" and "report disk failures · Issue #20562 · elastic/beats · GitHub", but it's far from easy since SMART metrics are hard to read and not a very reliable indicator, and the log messages are even harder to identify.
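To give a rough idea of what reading SMART data involves, here is a minimal Python sketch that reduces a parsed `smartctl --json` report to a few fields. It is not how Metricbeat or either linked issue does it. The field names (`smart_status.passed`, `ata_smart_attributes.table`) are my assumption about smartctl's JSON output for ATA drives, the set of watched attribute IDs is an illustrative choice rather than a reliable failure predictor, and `summarize_smart` and `SAMPLE` are hypothetical names.

```python
import json

# Attribute IDs often watched on ATA drives (reallocated, reported
# uncorrectable, pending, offline uncorrectable). This selection is an
# assumption for illustration, not an authoritative failure predictor.
WATCHED_IDS = {5, 187, 197, 198}


def summarize_smart(report):
    """Reduce a parsed smartctl JSON report (assumed layout) to a few fields."""
    passed = report.get("smart_status", {}).get("passed")
    table = report.get("ata_smart_attributes", {}).get("table", [])
    nonzero = {}
    for attr in table:
        raw_value = attr.get("raw", {}).get("value", 0)
        if attr.get("id") in WATCHED_IDS and raw_value > 0:
            # Fall back to the numeric ID if the attribute has no name.
            nonzero[attr.get("name", str(attr.get("id")))] = raw_value
    return {
        "passed": passed,
        "nonzero_watched": nonzero,
        # Flag the disk if the overall check failed or any watched raw
        # counter is above zero.
        "suspect": passed is False or bool(nonzero),
    }


# Hand-written sample shaped like the assumed JSON layout, not real drive data.
SAMPLE = json.loads("""
{
  "smart_status": {"passed": true},
  "ata_smart_attributes": {"table": [
    {"id": 5,   "name": "Reallocated_Sector_Ct",  "raw": {"value": 0}},
    {"id": 197, "name": "Current_Pending_Sector", "raw": {"value": 8}},
    {"id": 194, "name": "Temperature_Celsius",    "raw": {"value": 35}}
  ]}
}
""")

print(summarize_smart(SAMPLE))
```

On a real host the report would come from something like `smartctl --json -H -A /dev/sdX`, which usually needs root. Note that the sample disk still "passes" the overall health check while reporting pending sectors, which is part of why these metrics are hard to interpret.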

