Elasticsearch snapshot issue in Kubernetes

Hi,

I am currently running an Elasticsearch cluster on Kubernetes using the Elastic Operator. My setup includes 3 master nodes, 2 client nodes, and 3 worker nodes. I have configured the cluster to use an NFS path for snapshot backups.
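For reference, the snapshot repository is registered as a shared-filesystem (`fs`) repository pointing at the NFS mount. The sketch below shows roughly how that registration looks; the endpoint, credentials, repository name, and mount path are placeholders rather than my real values:

```python
# Rough sketch of how the NFS-backed snapshot repository is registered.
# The URL, credentials, repository name, and mount path are placeholders.
import requests

ES_URL = "https://localhost:9200"                # assumed cluster endpoint
AUTH = ("elastic", "changeme")                   # assumed credentials
REPO_NAME = "nfs_backups"                        # hypothetical repository name
NFS_MOUNT = "/usr/share/elasticsearch/backups"   # path where the NFS share is mounted in each pod

# The mount path must also be listed under path.repo in elasticsearch.yml
# on every node, and the same share must be mounted at that path on all of them.
resp = requests.put(
    f"{ES_URL}/_snapshot/{REPO_NAME}",
    json={"type": "fs", "settings": {"location": NFS_MOUNT}},
    auth=AUTH,
    verify=False,  # assumes a self-signed certificate; not for production use
)
resp.raise_for_status()
print(resp.json())
```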

Recently, I encountered an issue where the NFS path went down, which caused the entire Elasticsearch cluster to become unstable and eventually go down. Upon checking the logs, I found errors related to the missing NFS path directory.

Could you please advise on the following:

  1. Handling NFS Downtime: How can I configure Elasticsearch to handle NFS path unavailability without affecting the overall cluster stability? Are there any specific settings or configurations recommended for this scenario?

  2. Error Handling and Resilience: Are there any best practices to ensure that Elasticsearch nodes do not crash or become unstable due to issues with the NFS path? For instance, can we set up timeouts, retries, or fallbacks?

  3. Kubernetes Configuration: What Kubernetes configurations or features (e.g., Pod Disruption Budgets, Liveness/Readiness Probes) can be utilized to prevent Elasticsearch pods from going down due to NFS path issues?

  4. Alternative Storage Solutions: Would you recommend using more resilient storage solutions over NFS for snapshot backups? If so, what are the alternatives (for example, an object store, as in the sketch after this list) and how can they be integrated with the current setup?
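To illustrate what I mean by an alternative in point 4, this is the kind of object-store-backed repository I have in mind. The bucket name, endpoint, and credentials below are made up, and it assumes the S3 repository type is available and its credentials are already in the Elasticsearch keystore:

```python
# Sketch of registering an S3-backed snapshot repository as an alternative
# to the NFS-backed one. Bucket, endpoint, and credentials are placeholders.
import requests

ES_URL = "https://localhost:9200"    # assumed cluster endpoint
AUTH = ("elastic", "changeme")       # assumed credentials

resp = requests.put(
    f"{ES_URL}/_snapshot/s3_backups",  # hypothetical repository name
    json={"type": "s3", "settings": {"bucket": "my-es-snapshots"}},  # hypothetical bucket
    auth=AUTH,
    verify=False,  # assumes a self-signed certificate; not for production use
)
resp.raise_for_status()
```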

I appreciate your help in making our Elasticsearch deployment more resilient to such issues.

Thank you for your support.

What exactly do you mean by "become unstable" and "go down"?

I mean that I set up the Elasticsearch snapshot repository on an NFS path. Whenever the NFS path goes down, my Elasticsearch pods keep working as expected, but if an Elasticsearch pod is restarted while the path is down, the cluster goes down. Can you please help me figure out how to overcome this issue?

Can you please help me figure out how to overcome this issue?

Maybe, but this will not be possible without first understanding the issue. Your initial description is too vague; you need to describe the problem more precisely.

Recently, I encountered an issue where the NFS path went down, which caused the entire Elasticsearch cluster to become unstable and eventually go down. Upon checking the logs, I found errors related to the missing NFS path directory.

This is too vague, you need to be more precise.

Can we get on a call?

No, I'm just a volunteer here, I don't have time to spend on a call about your problem, sorry.

So please sort out my problem.

I'll do my best, but you must describe the problem first.

I have an Elastic cluster with 3 master nodes, 3 data nodes, and 2 client nodes. Since last week my NFS path has been down for some reason, and because of that my Elasticsearch cluster is also down. I am using the NFS path for the Elasticsearch snapshot repository. Can you tell me how I can mitigate this issue? I am using the Elasticsearch Operator.

You are still failing to describe the problem precisely enough for me to even begin to help you solve it.

I think you don't want to understand it. I am asking you in clear words.

You really aren't. If you rang a mechanic and said "my car doesn't work, please tell me how to fix it" would you really expect them to be able to give useful advice? That's effectively what you're doing here. I can think of hundreds of different ways that Elasticsearch could end up in a state you might describe as "down" or "unstable", each with a different resolution. You need to give much more detail.

In simple terms: when my NFS path is down, my Elasticsearch cluster is also down. Do you understand now or not?

No, you're still not describing what you mean by "down".

For some reason the NFS path crashed. When it became unavailable, Elasticsearch was unable to find the NFS path and it also crashed. When I look at the logs, they say it is unable to find the directory that is assigned as the NFS path.

You're still being remarkably coy about the details of your problem. By "crashed" do you mean that a node stopped running? Or maybe multiple nodes? If so, they would have included details of the problem in their logs.
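In the meantime, a quick check along these lines can at least tell you whether it's the repository or the cluster itself that is unreachable. The endpoint, credentials, and repository name below are placeholders for whatever yours actually are:

```python
# Distinguish "the snapshot repository is broken" from "the cluster is down".
# Endpoint, credentials, and repository name are placeholders.
import requests

ES_URL = "https://localhost:9200"   # assumed cluster endpoint
AUTH = ("elastic", "changeme")      # assumed credentials

# If this call fails, the cluster itself is unreachable.
health = requests.get(f"{ES_URL}/_cluster/health", auth=AUTH, verify=False, timeout=10)
print("cluster status:", health.json().get("status"))

# If the health call succeeds but this fails, only the repository (the NFS path)
# is broken: _verify asks every node to write a test file into the repository location.
verify = requests.post(f"{ES_URL}/_snapshot/nfs_backups/_verify", auth=AUTH, verify=False, timeout=30)
print("repository verification ok:", verify.ok)
if not verify.ok:
    print(verify.json())
```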