ElasticSearch cluster with four nodes

chinmoyd · October 20, 2015, 5:09am

My Java application is distributed as follows:
Four physical servers have three JVMs each, so 12 instances of the Java application are running in total. Each instance writes two different log files that are captured by Logstash and fed to Elasticsearch. Kibana displays the dashboard. When I run the application in only one JVM with single instances of ELK, things work fine.

I am trying to set up ELK in a clustered configuration. I am listing the IPs of the four machines to make the explanation and the log files easier to follow. The IPs are:
172.18.17.43 -- Elasticsearch client node
172.18.17.44 -- Elasticsearch data node1
172.18.17.45 -- Elasticsearch master node
172.18.17.46 -- Elasticsearch data node 2

Logstash is installed on each of the four machines, but points to Elasticsearch on the master node (172.18.17.45). Hence the logstash.conf is the same on all four machines. Kibana is installed only on the machine running the Elasticsearch client node (172.18.17.43).

The start sequence of ELK is as below:
Start the Elasticsearch master, then the client node, then the data nodes. Logstash is started in the same sequence. Kibana is started last.

ELK starts correctly and logs get posted to the Kibana indexes. Data gets parsed correctly. But after 5-10 minutes, the Elasticsearch master crashes. Sometimes the Kibana UI does not display anything. Any clue about what is wrong would be helpful.

Extract from the configuration files:

Elasticsearch master(172.18.17.45) yml:

cluster.name: npci
node.name: "elasticsearch_master"
node.master: true
node.data: false
network.publish_host: 172.18.17.45
network.host: 172.18.17.45
transport.tcp.port: 9300
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["172.18.17.45:9300"]
http.cors.enabled: true

Elasticsearch data node1(172.18.17.44) yml

cluster.name: npci
node.name: "elasticsearch_data1"
node.master: false
node.data: true
index.number_of_shards: 5
index.number_of_replicas: 1
network.publish_host: 172.18.17.44
network.host: 172.18.17.44
transport.tcp.port: 9301
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["172.18.17.45:9300"]
http.cors.enabled: true

Elasticsearch data node2(172.18.17.46) yml

cluster.name: npci
node.name: "elasticsearch_data2"
node.master: false
node.data: true
index.number_of_shards: 2
index.number_of_replicas: 1
network.publish_host: 172.18.17.46
network.host: 172.18.17.46
transport.tcp.port: 9303
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["172.18.17.45:9300"]
http.jsonp.enable: true

Elasticsearch client node(172.18.17.43) yml

cluster.name: npci
node.name: "elasticsearch_client"
node.master: false
node.data: false
index.number_of_shards: 0
index.number_of_replicas: 0
network.publish_host: 172.18.17.43
network.host: 172.18.17.43
transport.tcp.port: 9302
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["172.18.17.45:9300"]

The output in logstash.conf is as below:

output {
	elasticsearch {
		host => "172.18.17.45"
		cluster => "npci"
	}
}

With this configuration ELK starts correctly and logs get posted to the Kibana indexes. Data gets parsed correctly. But after 5-10 minutes, the Elasticsearch master crashes. Sometimes the Kibana UI does not display anything. Any clue about what is wrong would be helpful.

warkolm · October 20, 2015, 5:36am

What do the ES logs show?
Also, you shouldn't be indexing through your master; use the client node instead.

magnusbaeck · October 20, 2015, 5:42am

172.18.17.43 -- Elasticsearch client node
172.18.17.44 -- Elasticsearch data node1
172.18.17.45 -- Elasticsearch master node
172.18.17.46 -- Elasticsearch data node 2

Side note: You're probably siloing your nodes prematurely. With this setup your master node is a single point of failure for the whole cluster and I suspect you don't have the query load to warrant a separate client node.

Other comments:

Why have different port settings in transport.tcp.port?
There are a couple of settings (e.g. index.number_of_shards) that differ between the nodes that most likely should be the same. If you're maintaining these files by hand you're working too hard.
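
As an illustration of the second point, a short script can flag settings that differ between nodes. This is only a sketch: the dictionaries are typed in by hand from the configuration extracts in the first post, and `differing_settings` is a hypothetical helper name, not part of any Elasticsearch tooling.

```python
# Settings copied by hand from the elasticsearch.yml extracts in the first post.
# Identity and role settings (node.name, node.master, node.data, network.*)
# are left out on purpose, since those are expected to differ per node.
configs = {
    "elasticsearch_master": {
        "cluster.name": "npci",
        "transport.tcp.port": 9300,
    },
    "elasticsearch_data1": {
        "cluster.name": "npci",
        "transport.tcp.port": 9301,
        "index.number_of_shards": 5,
        "index.number_of_replicas": 1,
    },
    "elasticsearch_data2": {
        "cluster.name": "npci",
        "transport.tcp.port": 9303,
        "index.number_of_shards": 2,
        "index.number_of_replicas": 1,
    },
    "elasticsearch_client": {
        "cluster.name": "npci",
        "transport.tcp.port": 9302,
        "index.number_of_shards": 0,
        "index.number_of_replicas": 0,
    },
}


def differing_settings(configs):
    """Return {setting: {node: value}} for every setting whose value is not
    identical on all nodes. A node that lacks the setting counts as None."""
    keys = set()
    for settings in configs.values():
        keys.update(settings)
    result = {}
    for key in sorted(keys):
        values = {node: settings.get(key) for node, settings in configs.items()}
        if len(set(values.values())) > 1:
            result[key] = values
    return result


for key, values in differing_settings(configs).items():
    print(key, values)
```

Run against the extracts above, it reports the shard count, the replica count and the transport port as inconsistent, while `cluster.name` is the same everywhere.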

chinmoyd · October 20, 2015, 11:04am

The ES logs do not show any error message.
@magnusbaeck: Are you suggesting some other setup to avoid the single point of failure of the master? I was thinking of keeping single instances of Elasticsearch, Logstash and Kibana, with multiple entries in the logstash.conf for the log files from the different application nodes. These logs would be kept on a shared drive mounted on all the machines. Is that a good solution, considering a peak load of 5000 log entries per second for four hours?

For the TCP ports: because Elasticsearch was crashing, I was trying different TCP ports.
Are you suggesting that I not touch the elasticsearch.yml except for the following?

cluster.name: npci
node.name: "elasticsearch_client"
node.master: false
node.data: false
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.unicast.hosts: ["172.18.17.45:9300"]

magnusbaeck · October 20, 2015, 2:31pm

Are you suggesting some other setup to avoid the single point of failure of the master?

Have three or more master nodes? With the small number of nodes you have you don't need dedicated master nodes and probably not dedicated client nodes either.

I was thinking of keeping single instances of Elasticsearch, Logstash and Kibana, with multiple entries in the logstash.conf for the log files from the different application nodes. These logs would be kept on a shared drive mounted on all the machines. Is that a good solution, considering a peak load of 5000 log entries per second for four hours?

You mean have a single Logstash instance that reads log files from all machines via network-mounted file systems? That should work but is quite atypical.

For the TCP ports: because Elasticsearch was crashing, I was trying different TCP ports.
Are you suggesting that I not touch the elasticsearch.yml except for the following?

Don't make random configuration changes. Yes, the parameters you listed make up a reasonable minimum set.

chinmoyd · October 20, 2015, 3:52pm

If I do not keep any dedicated master and dedicated client, I think the only line I need to un-comment in elasticsearch.yml is:

cluster.name: npci

I shall keep logstash.conf as:

output{
elasticsearch { 
host => "localhost"
    }
}

I plan to keep Kibana in only one node. Will the elasticsearch data get replicated in all nodes in this configuration?
If it gets replicated in all nodes Kibana.yml can have

elasticsearch_url: "http://localhost:9200"

Else, please suggest which Elasticsearch node will it connect to.

chinmoyd · October 21, 2015, 4:22am

Dear Magnus: Could you please share your views on the configuration in my last post? Please let me know if I need to provide anything more.

warkolm · October 21, 2015, 4:37am

We recommend the use of unicast over multicast https://www.elastic.co/guide/en/elasticsearch/guide/current/_important_configuration_changes.html#_prefer_unicast_over_multicast

chinmoyd · October 21, 2015, 5:42am

If I do not use a dedicated master node, do I still need the following?

discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["172.18.17.45:9300"]

magnusbaeck · October 21, 2015, 5:44am

If I do not keep any dedicated master and dedicated client, I think the only line I need to un-comment in elasticsearch.yml is:

cluster.name: npci

I shall keep logstash.conf as:

output{
elasticsearch { 
host => "localhost"
    }
}

If you change ES's cluster name the Logstash configuration needs to be adjusted accordingly, unless you use the HTTP protocol (which you don't, but should do).

I plan to keep Kibana in only one node. Will the elasticsearch data get replicated in all nodes in this configuration?

Actual replication of data depends on the replica count of each index. All data in a cluster is available to all nodes regardless of replication, so you can connect to any cluster node. However, if you insist on having a client node you should connect to that node. That's the point of having a client node: it relieves data nodes and master nodes from dealing directly with queries. But again, with a cluster your size and the query load I expect you'll have, a client node is overkill.
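
To make the replica arithmetic concrete, here is a small sketch. An index has `number_of_shards` primary shards, each with `number_of_replicas` additional copies, and Elasticsearch does not place a primary and its own replica on the same node. The function names are made up for illustration; the inputs are the 5 shards and 1 replica from the data node1 extract and the four nodes in this thread.

```python
def total_shard_copies(number_of_shards, number_of_replicas):
    """Total shard copies (primaries plus replicas) held by one index."""
    return number_of_shards * (1 + number_of_replicas)


def replicas_for_copy_on_every_node(data_nodes):
    """Replica count needed so that every data node holds a copy of every
    shard: one primary plus (data_nodes - 1) replicas."""
    return data_nodes - 1


print(total_shard_copies(5, 1))            # 10
print(replicas_for_copy_on_every_node(4))  # 3
```

So with one replica each shard lives on two of the four nodes, not on all of them. Queries still work from any node because the node that receives a request forwards it to the nodes holding the relevant shards.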

If I do not use a dedicated master node, do I still need the following?

discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["172.18.17.45:9300"]

Yes. Newly started nodes need to be able to locate at least one other cluster node. This is unrelated to which master nodes you choose to have.

chinmoyd · October 21, 2015, 5:59am

Thanks. Let me try the following in each of the four elasticsearch.yml files:

cluster.name: npci
node.name: "elasticsearch_client"  # will keep four different names
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["172.18.17.45:9300"]

Will keep the logstash.conf as below:

output {
	elasticsearch {
		host => "172.18.17.45"  # will keep four different IPs in the four instances of Logstash
		cluster => "npci"
	}
}

Where should I explicitly specify HTTP?

magnusbaeck · October 21, 2015, 6:00am

Where should I explicitly specify HTTP?

In the elasticsearch output in your Logstash configuration. See the documentation for details.

chinmoyd · October 22, 2015, 6:43am

Now the following entry is present in all four elasticsearch.yml files.

Will it work for the Elasticsearch instance that I start first, given that no other Elasticsearch instance is available when the first one is started?

magnusbaeck · October 22, 2015, 7:47am

With your current settings, yes, the cluster will work with just one node up. But that's not a good situation. You should set discovery.zen.minimum_master_nodes to 3 to avoid split-brain situations. Then at least three of the nodes need to be online for the cluster to work. In return, any single node in the cluster can be shut down without affecting the cluster's availability.
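
The value 3 follows from the usual quorum rule for this setting: (number of master-eligible nodes / 2) + 1, rounded down. A minimal sketch, assuming all four nodes are master-eligible as discussed above; the function name is invented for illustration.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Quorum of master-eligible nodes: a strict majority."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


for nodes in (1, 3, 4):
    quorum = minimum_master_nodes(nodes)
    # nodes - quorum is how many nodes can be down while a master can still be elected
    print(nodes, quorum, nodes - quorum)
```

For four nodes this gives a quorum of 3 and tolerates one node being down, the same as a three-node cluster with a quorum of 2.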

chinmoyd · November 4, 2015, 12:10pm

discovery.zen.ping.unicast.hosts: ["172.18.17.45:9300"] -- Do we need to give 9300 or 9200?

Also, with the configuration we discussed in earlier posts (no dedicated Elasticsearch master), I am getting the following error while starting Logstash, even though I can ping http://10.1.1.11:9200 successfully.

INFO: I/O exception (org.apache.http.conn.HttpHostConnectException) caught when processing request to {}->http://10.1.1.11:9200: Connect to 10.1.1.11:9200 [/10.1.1.11] failed: Connection refused
Nov 04, 2015 5:23:37 PM org.apache.http.impl.execchain.RetryExec execute
INFO: Retrying request to {}->http://10.1.1.11:9200
Nov 04, 2015 5:23:37 PM org.apache.http.impl.execchain.RetryExec execute
INFO: I/O exception (org.apache.http.conn.HttpHostConnectException) caught when processing request to {}->http://10.1.1.11:9200: Connect to 10.1.1.11:9200 [/10.1.1.11] failed: Connection refused

magnusbaeck · November 4, 2015, 1:12pm

Cluster nodes talk to each other on port 9300. That's the default, so you can omit the port setting.
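
On the "Connection refused" lines in the log above: that message means the TCP connection to port 9200 was rejected, which a plain ICMP ping of the host would not reveal. One way to check both ports from the Logstash machine is sketched below. The helper name `port_open` is made up for illustration, and the possible causes mentioned afterwards are guesses rather than a diagnosis.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False
```

Calling `port_open("10.1.1.11", 9200)` and `port_open("10.1.1.11", 9300)` on the machine where Logstash runs shows whether the HTTP and transport ports accept connections. If 9200 is refused, it may be that Elasticsearch is not running on that host, or that it is bound to a different address through `network.host`.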
