ES API that returns the locally persisted cluster state, even before it is initialized/recovered

itiyamas · July 31, 2019, 4:05pm

Is there an API that returns the locally persisted cluster state before the cluster state has been recovered? I tried /_cluster/state?local, but it does not return the state persisted on disk if the state is not yet initialized.
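
For reference, the call I tried can be sketched as below. This is only a minimal sketch: the base URL, the `metadata` metric filter, and the helper names `local_state_url` and `fetch_local_state` are assumptions for illustration, not anything prescribed by Elasticsearch. It shows the request only; it does not change what the API returns before the state is initialized.

```python
import json
import urllib.request
from urllib.parse import urlencode


def local_state_url(base_url, metrics=("metadata",), local=True):
    """Build the cluster state URL, asking the node for its own local view.

    base_url and metrics are illustrative assumptions; adjust for your cluster.
    """
    path = "/_cluster/state"
    if metrics:
        path += "/" + ",".join(metrics)
    query = urlencode({"local": "true" if local else "false"})
    return base_url.rstrip("/") + path + "?" + query


def fetch_local_state(base_url, timeout=5.0):
    """Fetch and decode the JSON response (needs a reachable node)."""
    with urllib.request.urlopen(local_state_url(base_url), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only prints the URL; call fetch_local_state() against a live node.
    print(local_state_url("http://localhost:9200"))
```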

If there is no such API, I was wondering whether it makes sense to add one. I am thinking of the case of quorum loss, where it would make debugging the cluster a lot easier. Instead of going through the logs, we could make this local API call and get the state. The persisted state is already parsed when the node boots up.

DavidTurner · July 31, 2019, 9:40pm

You're almost certainly talking about getting the state of the system when no master has been elected, because if a master has been elected but the state has not been recovered then the cluster state API gives useful responses. However, the cluster state API is probably the wrong thing to rely on for investigating election issues because it doesn't include information about the discovery process, and you'd need this. It's also documented as unstable, so you shouldn't rely on it much anyway.

The reason for the lack of such an API is that a sensible orchestration system should already have all the information it needs to organise a proper election. It'd only really be useful for in-depth troubleshooting, and for that kind of work it's usually more appropriate to use logs.

itiyamas · August 5, 2019, 12:03pm

Agreed.
But such an API would prove helpful during quorum loss to figure out the best surviving node by comparing term and version from the cluster state output. Right now, the only way to figure out the best surviving node is using logs. With the API, I shall be able to automate this process.

DavidTurner · August 5, 2019, 12:25pm

I am confused. How does it help to identify the best surviving node, given that all of the surviving nodes might be stale?

itiyamas · August 5, 2019, 3:12pm

The API can provide the cluster state metadata. The copy with the highest term and cluster state version would be the latest, however stale it might be, correct?

DavidTurner · August 5, 2019, 3:52pm

No, if you've lost half or more of the master-eligible nodes in the cluster then you may not have any copies of the latest cluster state left, and crucially you cannot even tell whether that's the case or not. All the remaining copies might be stale, and using them may lead to arbitrary data loss. The only safe way to proceed in those circumstances is to restore the cluster from a snapshot.
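
To make the problem concrete, here is a small hypothetical sketch. The node names, terms, and versions are invented, and `StateCopy` and `newest` are illustrative helpers rather than anything in Elasticsearch. It assumes three master-eligible nodes where the latest state was accepted by two of them, and those two are then lost.

```python
from typing import NamedTuple


class StateCopy(NamedTuple):
    node: str
    term: int
    version: int


def newest(copies):
    """Pick the copy with the highest (term, version) among those given."""
    return max(copies, key=lambda c: (c.term, c.version))


# Hypothetical cluster: the latest state (term 6, version 50) reached a
# majority (node-b, node-c) but never reached node-a.
all_copies = [
    StateCopy("node-a", term=5, version=40),
    StateCopy("node-b", term=6, version=50),
    StateCopy("node-c", term=6, version=50),
]

# node-b and node-c are lost; only node-a survives.
survivors = [c for c in all_copies if c.node == "node-a"]

best_survivor = newest(survivors)
true_latest = newest(all_copies)

print("best surviving copy:", best_survivor)
print("latest copy that ever existed:", true_latest)
print("survivor is stale:", best_survivor[1:] < true_latest[1:])
```

The comparison only ranks the copies that are still visible. Nothing in the surviving data says whether a newer copy existed on the lost nodes, so the "best" survivor may be silently stale.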

itiyamas · August 5, 2019, 5:55pm

I understand that.
I am thinking of automating the unsafe bootstrap tool for my clusters. The above API would be used to decide the best surviving node for this tool to attempt a best-effort recovery.

DavidTurner · August 5, 2019, 5:58pm

This is a very bad idea.

