Understand Cluster HA functionality

Hi guys,

I need to understand some basic functionality of ES. I have a 4 node Elasticsearch cluster which has
2 data nodes, 2 masters and 1 client node.

On the client servers I run shell scripts and forward their output to ES like this:

curl -XPUT http://<>:9200/myindex/linux/$time -d ' {..}

If a node goes down, will the data automatically be redirected to another data node? If not, how can I overcome this?

Second, I want to create a new index with the same base name every 3 months. Is there any configuration available in Elasticsearch for this? I ask because the data is pushed directly from the client servers to Elasticsearch.

Thanks in advance.

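If you are going to have dedicated master nodes you should have 3 of them, and you should set minimum master nodes (discovery.zen.minimum_master_nodes) to 2. That way a majority is needed to elect a master, and the cluster cannot split into two halves that each elect their own.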

If a node goes down there won't be a process running on that server to do any redirecting of your request. Instead you should be making requests to Elasticsearch using one of the many client libraries rather than cURL. cURL should only be used as a tool for manual requests to Elasticsearch from the command line; your application should be using a proper client library. These client libraries will handle the case where a node is down for you and retry the request on a different node.
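If you really cannot move off shell scripts, a rough approximation of what the client libraries do is to keep a list of nodes and try them in order. The sketch below is only that, a sketch: the es_put function, the ES_NODES variable and the es-node1/es-node2 host names are made up for illustration, and it has none of the node sniffing or dead-node tracking a real client library gives you.

```shell
#!/usr/bin/env bash
# Hypothetical node list; replace with the addresses of your own nodes.
ES_NODES="${ES_NODES:-http://es-node1:9200 http://es-node2:9200}"

# es_put PATH JSON
# Tries each node in ES_NODES in order and stops at the first success.
es_put() {
  local path="$1" body="$2" node
  for node in $ES_NODES; do
    # --fail makes curl exit non-zero on HTTP errors (4xx/5xx), so a
    # rejected document is also retried on the next node.
    # --connect-timeout keeps a dead node from blocking the script.
    if curl --silent --show-error --fail \
         --connect-timeout 3 --max-time 10 \
         -X PUT "$node/$path" \
         -H 'Content-Type: application/json' \
         -d "$body"; then
      return 0
    fi
    echo "request to $node failed, trying next node" >&2
  done
  echo "all nodes failed for $path" >&2
  return 1
}
```

In a script this would be called as something like es_put "myindex/linux/$time" "$json", where $json holds the document body. The function returns non-zero only when every node in the list has failed, so the calling script can decide whether to buffer the output locally or give up.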

There is no configuration in Elasticsearch that will specifically create 3-monthly indices. Your ingest application will need to specify the correct index name for the three month period relevant to the batch of documents. Logstash could help here though, as in Logstash you can configure the index name for each document to be derived from a timestamp field in the document. Other things in Elasticsearch that could assist are Index Templates and Index Aliases.
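As an illustration of having the ingest side pick the index name, here is a minimal bash sketch that derives a quarterly name from the current date. The quarter_index function and the myindex-YYYY-qN naming scheme are assumptions for the example, not an Elasticsearch convention; any lowercase name that changes once per quarter would do.

```shell
#!/usr/bin/env bash
# quarter_index BASE [YEAR MONTH]
# Prints an index name such as "myindex-2016-q2".
# YEAR and MONTH default to the current date when omitted.
quarter_index() {
  local base="$1"
  local year="${2:-$(date +%Y)}"
  local month="${3:-$(date +%m)}"
  # 10# forces base 10 so that "08" and "09" are not parsed as octal.
  local quarter=$(( (10#$month - 1) / 3 + 1 ))
  echo "${base}-${year}-q${quarter}"
}
```

The script would then PUT to "$(quarter_index myindex)/linux/$time" instead of a fixed index name. An index alias covering all the quarterly indices lets searches keep using a single name.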

You might find this section (and surrounding sections) of the "Elasticsearch: The Definitive Guide" book useful to read given what you are trying to achieve.

Hi colings86,

Thanks for your quick reply. Hundreds of servers are forwarding custom script output to ES using cURL, which is called from inside the script, so when the script is executed cURL pushes the output in JSON format to Elasticsearch through Nginx.

I am planning to use Nginx for load balancing, as it is capable of managing HA for the ES cluster.


In the Logstash output we can specify more than one data node. Is Logstash able to manage redirecting data between data nodes?

I do not want to be stuck with any dependency on the client servers, as it would be painful for me to manage and troubleshoot when a problem arises.

Please give me your input.