Where is cluster metadata stored?

(Zaar Hai) #1

I have an ES 2.2.x cluster with dedicated data, master and client nodes. Master nodes manage the cluster metadata, but where do they persist it? (i.e. where is it stored if I gracefully shut down all of the nodes?)


(Martijn Van Groningen) #2

The cluster state (which includes the metadata) is stored in the data directory of each node in the cluster and is updated each time the cluster state changes. So you can safely shut down all the nodes in your cluster. Also read the following page: https://www.elastic.co/guide/en/elasticsearch/reference/2.2/restart-upgrade.html

It talks about a cluster restart in the context of a major version upgrade, but I recommend following the steps here if you do restart your cluster.
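For reference, here is a rough Python sketch of the request sequence that page describes, as far as I recall it: disable shard allocation, do a synced flush, restart, wait for yellow, re-enable allocation. The `restart_plan` and `render` helpers are hypothetical names made up for this illustration; check the endpoints and setting values against the linked page before relying on them.

```python
import json

# Setting name as used in the restart instructions for ES 2.x.
ALLOCATION_SETTING = "cluster.routing.allocation.enable"


def restart_plan():
    """Return the ordered steps of a full cluster restart as (method, target, body)."""
    return [
        # Stop the cluster from shuffling shards while nodes go down.
        ("PUT", "/_cluster/settings", {"persistent": {ALLOCATION_SETTING: "none"}}),
        # A synced flush speeds up shard recovery after the restart.
        ("POST", "/_flush/synced", None),
        # Not an API call: this step is done by hand or by your init system.
        ("MANUAL", "stop all nodes, then start them again (master-eligible nodes first)", None),
        # Block until all primary shards are allocated.
        ("GET", "/_cluster/health?wait_for_status=yellow", None),
        # Let replicas be allocated again.
        ("PUT", "/_cluster/settings", {"persistent": {ALLOCATION_SETTING: "all"}}),
    ]


def render(plan):
    """Format the plan as one human-readable line per step."""
    lines = []
    for method, target, body in plan:
        line = f"{method} {target}"
        if body is not None:
            line += " " + json.dumps(body, sort_keys=True)
        lines.append(line)
    return "\n".join(lines)
```

Calling `print(render(restart_plan()))` lists the steps in order. Note that the allocation setting is written as persistent, so it only survives the restart if the cluster metadata does.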

(Zaar Hai) #3

Thanks Martijn.

By "data directory of each node in the cluster", do you mean only nodes that have node.data param set to true or literally every node including masters and clients?

(Zaar Hai) #4

Can anyone please answer whether cluster metadata is stored on nodes having node.data=false?

(Christian Dahlqvist) #5

As Martijn stated, the cluster state is stored on ALL nodes in the cluster. This includes client nodes as they need to know the distribution of shards across the cluster in order to be able to correctly route requests.

(Zaar Hai) #6

Martijn's answer was not clear enough (for me), so I've rephrased my question.

Suppose I have the following simplified scenario: my master and client nodes run on non-persistent storage where data is lost after restart. Now I want to fully shutdown my cluster and bring it up again (i.e. only data node disks' contents will survive). Will it work?

(Christian Dahlqvist) #7

I don't know. Why would you want to run dedicated master nodes without persistent storage?

(Zaar Hai) #8

I'm trying to figure out ES requirements for storage for different node types. I guess I'll have to try that myself and post back here.

Why non-persistent storage for masters? If it's not required, I have (at least) 3 fewer disks to worry about (health, backups, etc.)

(Zaar Hai) #9

I did some testing here. While ES was able to recover "dangling" indexes and transient settings just fine, this is only one simple scenario that I have tested.

Do I need to backup cluster global metadata?

(Zaar Hai) #10

Update - more testing was done. Conclusion - metadata persistence is vital.
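For anyone who wants to repeat this kind of test, one possible approach is to save the output of `GET /_cluster/state/metadata` before the shutdown and compare it with the output after the restart. Below is a minimal Python sketch of that comparison. The helper names (`fetch_metadata`, `metadata_summary`, `lost_after_restart`) are hypothetical, and it only compares index and template names, not settings or mappings.

```python
import json
import urllib.request


def fetch_metadata(base_url):
    """Fetch the metadata part of the cluster state from a running node."""
    url = base_url.rstrip("/") + "/_cluster/state/metadata"
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def metadata_summary(state):
    """Reduce a cluster state response to the names worth comparing."""
    meta = state.get("metadata", {})
    return {
        "indices": set(meta.get("indices", {})),
        "templates": set(meta.get("templates", {})),
    }


def lost_after_restart(before, after):
    """Return the index and template names present before but missing after."""
    b, a = metadata_summary(before), metadata_summary(after)
    return {kind: sorted(b[kind] - a[kind]) for kind in b}
```

Usage would be along the lines of: call `fetch_metadata("http://localhost:9200")` (address assumed) before the shutdown, keep the result, call it again once the cluster is back, and pass both to `lost_after_restart`. An empty list for a kind means nothing of that kind went missing by name.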

(Mark Walkom) #11

Thanks for sharing this, it's an interesting blog post!
I'll see if we can make this clearer in the docs.

(Jörg Prante) #12

You are correct, master nodes are clueless about erased cluster state files, they will continue to start up with an empty memory, and bad things might happen.

AFAIK cluster state is not saved on nodes with node.master=false which is also a pitfall when taking down all master nodes while data nodes are up.

The picture in the blog post is not exact. The index state does not turn to green before a quorum of all data nodes with shards for the index is up. Before that, the state is yellow.

The rule is "do not index documents to a restarted cluster with replica shard count >0 when index state is yellow, wait for recovery of index until last node has joined". Otherwise, primary shards may flip to a shard on a node joined more recently and previously indexed documents might get lost.
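A rough Python sketch of how a client could enforce that rule before it starts indexing: ask the cluster health API to wait for green, then check that every expected data node has joined. The function names and the `expected_data_nodes` parameter are my own for illustration; the response fields (`status`, `timed_out`, `number_of_data_nodes`, `unassigned_shards`) are the ones the health API returns.

```python
import json
import urllib.request
from urllib.parse import urlencode


def health_url(base_url, wait_for="green", timeout="60s"):
    """Build a health URL that blocks until the status is reached or the timeout passes."""
    query = urlencode({"wait_for_status": wait_for, "timeout": timeout})
    return base_url.rstrip("/") + "/_cluster/health?" + query


def fetch_health(base_url):
    """Call the cluster health API and return the parsed JSON response."""
    with urllib.request.urlopen(health_url(base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def safe_to_index(health, expected_data_nodes):
    """True only if the cluster is green and all expected data nodes have joined."""
    return (
        # A missing timed_out field is treated as a timeout, to err on the safe side.
        not health.get("timed_out", True)
        and health.get("status") == "green"
        and health.get("number_of_data_nodes", 0) >= expected_data_nodes
        and health.get("unassigned_shards", 0) == 0
    )
```

An indexing job would then call `safe_to_index(fetch_health("http://localhost:9200"), expected_data_nodes=3)` (address and node count assumed) and hold off while it returns False.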

See description of issues and mechanism of "allocation IDs" at https://www.elastic.co/guide/en/elasticsearch/resiliency/current/index.html

(Zaar Hai) #13

What picture are you referring to exactly? In my situation masters come from empty disks and another empty data-node joins - should be all green.
