High Availability on Elasticsearch

amit.patra · August 20, 2018, 3:14pm

Hi,

I have installed and configured a multi-node Elasticsearch cluster on RHEL 6.9.

Master : 192.168.2.79

Data-Node1 : 192.168.2.80

Data-Node2 : 192.168.2.81

How do I make ES highly available?

Question 1: If data node 2 goes down, what will happen? How will I still be able to show all the data?

Question 2: If storage is full on datanode1 and datanode2, how can I increase storage without disrupting the system?

Question 3: If we add an extra datanode3 when datanode1 and datanode2 are full, what will happen?

Thanks

Christian_Dahlqvist · August 20, 2018, 6:07pm

You need to have at least 3 master-eligible nodes on separate hosts. This will allow 2 nodes to reach a majority even if one node is unavailable, assuming you have configured minimum_master_nodes correctly. You should therefore make your data nodes master-eligible as well.
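The majority rule behind this advice can be sketched in a few lines. This is only an illustration, not Elasticsearch code: the function names are hypothetical, and it assumes the usual formula of (master-eligible nodes / 2) + 1 for the minimum_master_nodes setting.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest number of master-eligible nodes that forms a majority."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible_nodes - quorum(master_eligible_nodes)


# Compare a few cluster sizes: node count, quorum, failures tolerated.
for n in (1, 2, 3, 5):
    print(n, quorum(n), tolerated_failures(n))
```

With a single dedicated master, or with two master-eligible nodes, no failure can be tolerated. Three master-eligible nodes is the smallest setup that survives the loss of one.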

Assuming you have a replica configured for all indices, the cluster will still be able to serve data.
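A toy model may help show why one replica is enough to survive a single node failure. It is a simplified sketch under stated assumptions: the round-robin placement and the node names are made up for illustration, and real shard allocation is more sophisticated. The one property it relies on is that the copies of a shard are never placed on the same node.

```python
def place_shards(num_shards, nodes, replicas=1):
    """Toy round-robin placement: copies of one shard go to different nodes."""
    if replicas + 1 > len(nodes):
        raise ValueError("not enough nodes to keep copies on separate hosts")
    placement = {}
    for shard in range(num_shards):
        # Copy 0 is the primary, the rest are replicas on the following nodes.
        placement[shard] = [
            nodes[(shard + copy) % len(nodes)] for copy in range(replicas + 1)
        ]
    return placement


def shards_still_available(placement, failed_node):
    """True if every shard keeps at least one copy after failed_node is lost."""
    return all(
        any(node != failed_node for node in copies)
        for copies in placement.values()
    )


nodes = ["node1", "node2", "node3"]  # hypothetical node names
with_replica = place_shards(5, nodes, replicas=1)
without_replica = place_shards(5, nodes, replicas=0)
print(shards_still_available(with_replica, "node2"))
print(shards_still_available(without_replica, "node2"))
```

In this model, losing any one node leaves every shard with a surviving copy when one replica is configured, while with zero replicas the shards that lived on the failed node are gone.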

You should monitor disk space and act before it gets full as indices will be made read-only and/or you may suffer from index corruption and data loss.
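As a rough sketch of what such monitoring could look like, the script below classifies disk usage against three thresholds. The thresholds are assumptions modelled on the default disk watermarks in Elasticsearch 6.x (85% low, 90% high, 95% flood stage, at which point indices are marked read-only); check the values for your own version. The function names and the path are placeholders.

```python
import shutil

# Assumed thresholds, modelled on the default Elasticsearch 6.x disk
# watermarks. Verify them against your cluster settings.
LOW, HIGH, FLOOD = 85.0, 90.0, 95.0


def classify(used_percent: float) -> str:
    """Map a disk usage percentage to a watermark level."""
    if used_percent >= FLOOD:
        return "flood_stage"  # indices on this node become read-only
    if used_percent >= HIGH:
        return "high"         # shards start being moved off this node
    if used_percent >= LOW:
        return "low"          # no new shards are allocated to this node
    return "ok"


def used_percent(path: str) -> float:
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


# "." is a placeholder; point this at the node's path.data directory.
pct = used_percent(".")
print(f"{pct:.1f}% used: {classify(pct)}")
```

Alerting well before the "low" level leaves time to add nodes or storage while the cluster is still fully writable.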

Elasticsearch will automatically redistribute data across all the data nodes available in the cluster.

amit.patra · August 21, 2018, 2:58am

Hi @Christian_Dahlqvist,

Thanks for your valuable response.

Regarding question 1: if we configure replicas, the storage requirement will be huge. Suppose we have a 5-node cluster and the cluster holds 50 TB without replicas. If we configure replicas, will the cluster size become 50 x 5 = 250 TB (approx.)? Is this a feasible solution?
Correct me if I am wrong about the replica concept. Is there any other way to handle node failover?

Regarding question 2: suppose storage is 90% full and we want to add extra storage without disrupting the cluster. Do we need to back up the whole data directory before increasing the storage?

Regarding question 3: suppose we have 2 data nodes, and path.data on each node holds 15 TB.
In that scenario, if we add datanode3, will Elasticsearch automatically redistribute the data so that each data node holds 10 TB?
Correct me if I am wrong.

Thanks,

Christian_Dahlqvist · August 21, 2018, 5:14am

If your primary shards take up 50TB of storage, configuring 1 replica will double this to 100TB. If you do not have at least 1 replica configured, you cannot have high availability.
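The arithmetic depends on the number of replicas, not on the number of nodes. A minimal sketch, with a hypothetical helper name:

```python
def total_storage_tb(primary_tb: float, replicas: int) -> float:
    """Total disk needed: the primary copy plus `replicas` extra copies."""
    if replicas < 0:
        raise ValueError("replicas must be zero or more")
    return primary_tb * (1 + replicas)


print(total_storage_tb(50, 0))  # primaries only
print(total_storage_tb(50, 1))  # one replica of every shard
```

So 50TB of primaries with number_of_replicas set to 1 needs 100TB across the whole cluster, whether that cluster has 2 data nodes or 5.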

Assuming you have a replica configured, you should be able to take down and modify/upgrade one node at a time while leaving the cluster operable.

Yes, that is basically correct.
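For the numbers in the question, the expected share per node can be worked out as below. This assumes a perfectly even spread, which is an idealisation: the cluster balances shards rather than bytes, so real per-node sizes will only be approximately equal. The helper name is hypothetical.

```python
def per_node_tb(total_tb: float, data_nodes: int) -> float:
    """Ideal share of data per node, assuming a perfectly even spread."""
    if data_nodes < 1:
        raise ValueError("need at least one data node")
    return total_tb / data_nodes


total = 2 * 15  # two data nodes holding 15 TB each
print(per_node_tb(total, 2))  # before adding datanode3
print(per_node_tb(total, 3))  # after adding datanode3
```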

If you have 100TB of data you are likely to need more than 2 data nodes. Elasticsearch nodes cannot hold an infinite amount of data, as the amount of heap available limits this. Exactly how much a node can hold will depend on the use case. Have a look at the following resources:

amit.patra · August 21, 2018, 6:03am

Hi @Christian_Dahlqvist

Thanks for your valuable response.

Regarding replication: if we set number_of_replicas: 1, is there any chance that it will slow down data ingestion? We ingest data through Logstash.

Thanks,

Christian_Dahlqvist · August 21, 2018, 6:05am

Replication can slow down ingest, as the same data needs to be indexed twice, but that is the price to pay for increased availability and resilience.

amit.patra · August 21, 2018, 6:29am

Hi @Christian_Dahlqvist

Thanks for your valuable response.

Can we implement high availability on the storage side instead, using a SAN or something similar?

Thanks,

Christian_Dahlqvist · August 21, 2018, 6:43am

Different Elasticsearch nodes need different copies of the data as they are managed separately, so using a SAN will not reduce storage requirements.

amit.patra · August 21, 2018, 12:54pm

Hi @Christian_Dahlqvist

Can you please suggest how I can configure a 5-server Elasticsearch cluster on shared storage?

Thanks,

amit.patra · August 22, 2018, 4:42am

Hi @Christian_Dahlqvist,

My concern is this: I want a cluster of 5 data nodes, with the data mounted at the same path, such as /es-data/, on our storage. Is this possible? If so, how?

Thanks,

Christian_Dahlqvist · August 22, 2018, 4:44am

What type of shared storage?

amit.patra · August 22, 2018, 10:50am

Hi @Christian_Dahlqvist,

SSD storage tier
A single RAID5 storage pool:
12 * 200GB EFD
250GB LUN for parent images
500GB LUN for infrastructure
75GB LUNs for replica stores (1 per node pool cluster)

Thanks,

Christian_Dahlqvist · August 22, 2018, 10:53am

How much data are you looking to store in the cluster? The size of that storage correlates badly with the 50-100TB you used as an example...

surj08 · August 23, 2018, 6:16pm

Do you know if this includes SANs with de-duplication? I have wondered if I could get redundancy on the application side without affecting space much.

Christian_Dahlqvist · August 23, 2018, 6:19pm

Yes, each Elasticsearch node needs its own storage.

system · September 20, 2018, 6:20pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
