I have an ES 2.2.x cluster with dedicated data, master and client nodes. Master nodes manage the cluster metadata, but where do they persist it? (i.e. where is it stored if I gracefully shut down all of the nodes?)
Thanks,
Zaar
The cluster state (which includes the metadata) is stored in the data directory of each node in the cluster and is updated each time the cluster state changes. So you can safely shut down all the nodes in your cluster. Also read the following page: https://www.elastic.co/guide/en/elasticsearch/reference/2.2/restart-upgrade.html
It talks about a cluster restart in the context of a major version upgrade, but I recommend following the steps here if you do restart your cluster.
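For anyone who wants to see what a given node actually has on disk, here is a minimal Python sketch. It assumes the ES 2.x on-disk layout, where state files end in `.st` and live in `_state` directories under `path.data` (for example `<cluster_name>/nodes/0/_state/`); that layout is my assumption, not something stated in this thread, and `find_state_files` is a hypothetical helper name.

```python
import os


def find_state_files(data_path):
    """Return relative paths of *.st files found inside `_state` directories.

    Assumption: state files are named like global-N.st / state-N.st and
    are kept in directories called `_state` somewhere under path.data.
    """
    found = []
    for root, _dirs, files in os.walk(data_path):
        # Only look at directories that are literally named "_state".
        if os.path.basename(root) != "_state":
            continue
        for name in files:
            if name.endswith(".st"):
                full = os.path.join(root, name)
                found.append(os.path.relpath(full, data_path))
    return sorted(found)
```

Calling `find_state_files("/var/lib/elasticsearch")` (the path is only an example) on a master, data and client node in turn should show which node types hold state files. An empty list means no such files were found under that directory.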
Thanks Martijn.
By "data directory of each node in the cluster", do you mean only nodes that have node.data param set to true or literally every node including masters and clients?
Can anyone please answer whether cluster metadata is stored on nodes having node.data=false?
As Martijn stated, the cluster state is stored on ALL nodes in the cluster. This includes client nodes as they need to know the distribution of shards across the cluster in order to be able to correctly route requests.
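To look at the shard distribution a particular node knows about, one option is the cluster state API with `local=true`, which asks the node you connect to for its own copy of the state. The sketch below only builds the URL and summarises an already-parsed response. The helper names are hypothetical, and the response shape (`routing_table.indices.<index>.shards.<n>` as a list of copies with a `node` field) is my assumption about the 2.x format. Note that this shows the in-memory state, not what is persisted on disk.

```python
from collections import Counter


def state_url(base_url, metrics=("metadata", "routing_table"), local=True):
    """Build a GET /_cluster/state/{metrics} URL for one node."""
    url = base_url.rstrip("/") + "/_cluster/state/" + ",".join(metrics)
    return url + ("?local=true" if local else "")


def shard_copies_per_node(state):
    """Count shard copies per node id in a parsed cluster state response.

    Unassigned copies are counted under the key None.
    """
    counts = Counter()
    indices = state.get("routing_table", {}).get("indices", {})
    for index in indices.values():
        for copies in index.get("shards", {}).values():
            for copy in copies:
                counts[copy.get("node")] += 1
    return dict(counts)
```

Fetching `state_url("http://client-node:9200")` (hostname is a placeholder) and passing the decoded JSON to `shard_copies_per_node` gives a quick view of what that node believes about shard placement.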
Martijn's answer was not clear enough (for me), so I've rephrased my question.
Suppose I have the following simplified scenario: my master and client nodes run on non-persistent storage where data is lost after restart. Now I want to fully shut down my cluster and bring it up again (i.e. only the contents of the data nodes' disks will survive). Will it work?
I don't know. Why would you want to run dedicated master nodes without persistent storage?
I'm trying to figure out ES requirements for storage for different node types. I guess I'll have to try that myself and post back here.
Why non-persistent storage for masters? If it's not required, I have (at least) 3 fewer disks to worry about (health, backups, etc.)
I did some testing here. While ES was able to recover "dangling" indexes and transient settings just fine, this is only one simple scenario that I have tested.
Update: more testing was done. Conclusion: metadata persistence is vital.
Thanks for sharing this, it's an interesting blog post!
I'll see if we can make this clearer in the docs.
You are correct: master nodes are clueless about erased cluster state files. They will start up with empty memory, and bad things might happen.
AFAIK cluster state is not saved on nodes with node.master=false, which is also a pitfall when taking down all master nodes while data nodes are up.
The picture in the blog post is not exact. The index state does not turn to green before a quorum of all data nodes with shards for the index is up. Before that, the state is yellow.
The rule is "do not index documents to a restarted cluster with replica shard count >0 when index state is yellow, wait for recovery of index until last node has joined". Otherwise, primary shards may flip to a shard on a node joined more recently and previously indexed documents might get lost.
See description of issues and mechanism of "allocation IDs" at https://www.elastic.co/guide/en/elasticsearch/resiliency/current/index.html
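As a rough illustration of the rule above, a client can block on the cluster health API before it resumes indexing. This is a sketch only: it uses green status as a stand-in for "recovery has finished", which is my simplification, and the function names are hypothetical. `wait_for_status` and `timeout` are parameters of `GET /_cluster/health`, and the response carries `status` and `timed_out` fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def health_url(base_url, index=None, timeout="60s"):
    """Build a health URL that blocks until green or until the timeout."""
    path = "/_cluster/health" + ("/" + index if index else "")
    query = urlencode({"wait_for_status": "green", "timeout": timeout})
    return base_url.rstrip("/") + path + "?" + query


def safe_to_index(health):
    """True only if the health response is green and the wait did not time out."""
    return health.get("status") == "green" and not health.get("timed_out", False)


def wait_until_green(base_url, index=None, timeout="60s"):
    """Call the health API once and report whether indexing may resume."""
    with urlopen(health_url(base_url, index, timeout)) as resp:
        return safe_to_index(json.loads(resp.read().decode("utf-8")))
```

An indexing job would call `wait_until_green("http://localhost:9200", "my_index")` after the restart and only start writing when it returns `True`. The URL and index name here are placeholders.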
What picture are you referring to exactly? In my situation masters come from empty disks and another empty data-node joins - should be all green.