Index sharing in a 3-node cluster


(Ishan Bose) #1

Hi,

I am trying to set up a three node cluster on Elasticsearch 2.3 with PeopleSoft 9.2/8.55 Tools.

The three nodes run on three Linux VMs, and each node is master- and data-eligible. I need to set up failover so that even if two of the nodes go down, the third node is elected master and handles the write/index operations.
For now I have set discovery.zen.minimum_master_nodes to 1 so that at any point one of the nodes will act as master, even though this is not recommended. Is that feasible?

My next question is: how can I confirm that the indices are being shared automatically?
When I run 'du -sh *' in $ES_DATA/Cluster_Name/nodes, I get a different size on each of the three nodes.
With three nodes up and running I get, say, 100 search results in the PeopleSoft GUI. If I bring down one of the nodes, it shows 70-odd search results.
The number of search results drops again if two of the nodes are down.

Please ignore the beginner-level naive question up next.
On all the forums I read about using the cluster health API to monitor health. But how do I use that on a Linux machine?

Please suggest.

Thanks
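
On the cluster health question: the API is a plain HTTP endpoint, so any HTTP client on the Linux VM can call it (for example, curl against /_cluster/health on the node's HTTP port). Below is a minimal Python sketch of the same call. It assumes a node is reachable at http://localhost:9200, which is the default HTTP port but is not stated in this thread; the helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: a node answers HTTP on localhost:9200 (the default port).
ES_URL = "http://localhost:9200"


def fetch_cluster_health(base_url=ES_URL, timeout=5):
    """GET /_cluster/health and return the parsed JSON body as a dict."""
    url = base_url + "/_cluster/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_health(health):
    """Reduce a cluster health response to a one-line summary."""
    return "status={status} nodes={number_of_nodes} unassigned_shards={unassigned_shards}".format(
        **health
    )


# Usage on a machine that can reach the cluster:
#   print(summarize_health(fetch_cluster_health()))
```

A "green" status means all primary and replica shards are allocated, "yellow" means some replicas are unassigned, and "red" means at least one primary shard is unassigned.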


(Christian Dahlqvist) #2

With 3 master/data nodes in a cluster, you need a minimum of two available for it to operate properly and prevent data loss and split-brain scenarios. Allowing the cluster to accept reads and writes with only a single node available means that you risk losing data.

In order to ensure that all nodes hold a copy of all data, you will need to set the number of replicas on all indices to 2.
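
As a rough illustration of the replica advice above, the sketch below builds the request for the update index settings API, which changes number_of_replicas on existing indices. The base URL and the helper name are assumptions for illustration; the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_replica_request(base_url, replicas, index="_all"):
    """Build (without sending) a PUT to <index>/_settings that sets the replica count."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url="{}/{}/_settings".format(base_url, index),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumption: the node's HTTP endpoint is http://localhost:9200.
req = build_replica_request("http://localhost:9200", replicas=2)
print(req.get_method(), req.full_url)
# To apply it: urllib.request.urlopen(req)
```

With one primary and two replicas per shard, each of the three nodes can hold one copy of every shard, because Elasticsearch does not place two copies of the same shard on the same node.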


(Ishan Bose) #3

Thanks, Christian. I understand the point you are making.
In that case, how does failover work if two of the nodes go down?
I am bringing the master nodes down one by one until there is only one node left. Will the last node then make itself master?

And talking about parameters, do I have to specify the number of replicas or shards, or any other specific parameters, for the data to be synced across all the nodes with the minimum number of master nodes set to two?

We are putting the data on an NFS drive which is shared across all three VMs. Is that a good idea, or should all nodes have their own data path? What is the logic behind the path.shared_data parameter?

Any help in this regard will be appreciated. Thanks again.


(Christian Dahlqvist) #4

Not if you have configured the cluster correctly, as this could lead to data loss. A minimum of 2 nodes is required for a fully functional cluster.

You can specify this through index templates.

Using NFS is generally not a good idea. All nodes should have separate data paths, and local disks are recommended.
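
For the index template suggestion, a minimal sketch of a template body in the Elasticsearch 2.x format is shown below. The pattern "*" and the function name are illustrative assumptions; the body would be sent with PUT to /_template/<name>, where the template name is whatever you choose.

```python
import json


def build_replica_template(pattern="*", replicas=2):
    """Body for PUT /_template/<name> in the Elasticsearch 2.x template format."""
    return {
        "template": pattern,  # index name pattern the template applies to
        "settings": {"number_of_replicas": replicas},
    }


print(json.dumps(build_replica_template(), indent=2))
```

Templates are applied only when an index is created, so indices that already exist keep their current replica count until their settings are updated directly.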


(Ishan Bose) #5

So correct me if I am wrong: in a three-node cluster, this kind of failover cannot be achieved. If the cluster is configured properly and the first master node goes down, and then the second master is also brought down, the third node will not declare itself master because the minimum master node requirement is not fulfilled.

Could you please elaborate on this? I am completely new to this area.
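
The reasoning in this post can be checked with a small sketch of the usual quorum rule, (master-eligible nodes / 2) + 1. The function names are hypothetical, and this only models the arithmetic, not Elasticsearch's actual election code.

```python
def quorum(master_eligible_nodes):
    """Majority of master-eligible nodes: floor(n / 2) + 1."""
    return master_eligible_nodes // 2 + 1


def can_elect_master(reachable_master_eligible, minimum_master_nodes):
    """A master can be elected only if enough master-eligible nodes are reachable."""
    return reachable_master_eligible >= minimum_master_nodes


minimum = quorum(3)  # three master-eligible nodes, as in this thread
for alive in (3, 2, 1):
    print(alive, "node(s) up ->", "master possible" if can_elect_master(alive, minimum) else "no master")
```

With three master-eligible nodes the quorum is 2, so a single surviving node cannot elect itself. Setting the minimum to 1 would allow it, at the cost of the split-brain risk described earlier in the thread.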


(Ishan Bose) #6

Guys, need some help if you please. Still facing the same issue of data loss.


(Christian Dahlqvist) #7

Did you set discovery.zen.minimum_master_nodes to 2 in order to protect yourself against split-brain scenarios and data loss?


(Ishan Bose) #8

Thank you, Christian for your response. Yes, I did set it to two.
Let me explain what I am referring to when I say data loss.

With all three nodes running, I am getting, say 100 rows in search result.
With one node down, two nodes up I am getting 50 rows in search result.

This occurs even after setting index.number_of_replicas to two.


(Christian Dahlqvist) #9

That sounds strange. Can you provide the output from the cat shards API for some of the indices where you see this problem?
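
As a sketch of what to look for in that output: the cat shards API (GET /_cat/shards, optionally with ?h=index,shard,prirep,state,node to select columns) prints one line per shard copy. The parser below assumes that column selection. The sample text is made up for illustration and is not output from this cluster.

```python
def parse_cat_shards(text):
    """Parse lines of the form: index shard prirep state [node]."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4 or not parts[1].isdigit():
            continue  # skip blank lines and an optional header row
        rows.append({
            "index": parts[0],
            "shard": int(parts[1]),
            "prirep": parts[2],  # "p" = primary, "r" = replica
            "state": parts[3],
            "node": parts[4] if len(parts) > 4 else None,  # empty when unassigned
        })
    return rows


def shards_not_started(rows):
    """Return shard copies that are not in the STARTED state."""
    return [r for r in rows if r["state"] != "STARTED"]


# Illustrative sample only; index and node names are hypothetical.
SAMPLE = """\
example_index 0 p STARTED node-1
example_index 0 r STARTED node-2
example_index 0 r UNASSIGNED
"""

for row in shards_not_started(parse_cat_shards(SAMPLE)):
    print(row)
```

If every shard shows one "p" line and two "r" lines in the STARTED state on three different nodes, each node holds a full copy of the index, and losing one node should not reduce the number of search results.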


(Ishan Bose) #10

Hi Christian,

I will try to get the output soon.

Thanks

