I have an ES cluster with nodes across two data centres, with 1 coordinating node and 3 master+data nodes in each DC.
I have 2 questions here:
Is there any harm in using nodes that act as both master and data?
For the minimum number of master nodes, as per the rule I should set 4, but in that case, if something happens to one of the DCs, my cluster won't come back up. What is the suggested way to handle this scenario?
Unless the connection between the two data centres has very good throughput and very low latency, this type of deployment is not recommended as Elasticsearch requires good and reliable connections within a cluster.
With just two data centres, you cannot configure a symmetrically highly available cluster where Elasticsearch will continue operating when either of the data centres is lost. To get this you will need a third DC.
Using nodes as both master and data works for many scenarios, but can cause problems if the nodes come under high load, in which case dedicated master nodes that do not serve traffic are useful.
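To make the quorum arithmetic concrete, here is a minimal Python sketch (not Elasticsearch code; the function names are hypothetical). It assumes the usual majority rule for the minimum number of master-eligible nodes, (N / 2) + 1 with integer division, and checks every way of splitting 6 master-eligible nodes across two DCs to see whether a quorum survives the loss of either DC.

```python
def quorum(master_eligible: int) -> int:
    """Majority of master-eligible nodes: floor(N / 2) + 1."""
    return master_eligible // 2 + 1


def survives_loss_of_either_dc(dc1: int, dc2: int) -> bool:
    """True only if each DC on its own still holds a majority of all master-eligible nodes."""
    needed = quorum(dc1 + dc2)
    return dc1 >= needed and dc2 >= needed


total = 6  # 3 master-eligible nodes per DC, as in the question
print("quorum for", total, "nodes:", quorum(total))
for dc1 in range(total + 1):
    dc2 = total - dc1
    ok = survives_loss_of_either_dc(dc1, dc2)
    print(f"DC-1={dc1} DC-2={dc2} survives loss of either DC: {ok}")
```

No split satisfies the check, because two disjoint groups cannot both hold a strict majority of the same set of nodes.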
What kind of traffic are you referring to here? Read or write? And is that the default for a dedicated master node?
Just to add, I am using the coordinating node to call ES APIs wherever required.
Dedicated master nodes should ideally be left to just monitor the cluster. This means that they do not serve read or write traffic and can be much smaller than the data nodes.
Unfortunately, I don't have the luxury of adding a third data centre, so I'll have to live with 2 DCs. I understand the trade-off: I can either avoid split brain or provide safety from data centre failures.
So, this is what I am planning to keep:
DC-1: 1 coordinating, 1 master+data, 1 dedicated master, 2 dedicated data
DC-2: 1 coordinating, 1 dedicated master, 3 dedicated data
Minimum number of master-eligible nodes: 2
And use the coordinating nodes to communicate with the cluster.
With this config, are there too many dedicated master nodes?
Since there can be only one master at a time, let's say the node which is master+data gets elected as master. In that case, can I still have issues under load? As you mentioned, dedicated master nodes are helpful if the nodes come under high load.
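As a rough sanity check of the layout above, the following Python sketch (illustrative only; the variable and function names are hypothetical) counts the master-eligible nodes per DC, derives the minimum from the majority rule, and tests whether a master could still be elected after losing each DC. It assumes coordinating and data-only nodes are not master-eligible.

```python
# Master-eligible nodes per DC in the proposed layout.
# DC-1: 1 master+data + 1 dedicated master; DC-2: 1 dedicated master.
master_eligible = {"DC-1": 2, "DC-2": 1}

total = sum(master_eligible.values())
minimum_master_nodes = total // 2 + 1  # majority rule: floor(N / 2) + 1


def can_elect_master_without(lost_dc: str) -> bool:
    """True if the surviving DC still holds enough master-eligible nodes."""
    remaining = total - master_eligible[lost_dc]
    return remaining >= minimum_master_nodes


print("master-eligible nodes:", total)
print("minimum master nodes:", minimum_master_nodes)
for dc in master_eligible:
    print(f"lose {dc}: master can be elected = {can_elect_master_without(dc)}")
```

Under these assumptions the planned minimum of 2 matches the majority rule for 3 master-eligible nodes, and the cluster tolerates losing DC-2 but not DC-1.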