Logstash output to ES cluster - sharding?

Anish_Sujanani · May 8, 2019, 6:29am

Hi all,

My cluster currently involves:
Machine 1 : ES - co-ordinating node + Kibana
Machine 2 : ES - master + data + Logstash
Machine 3 : ES - master + data + Logstash
Machine 4 : ES - master + data + Logstash
Sharding : 1 primary per index, 1 replica, Logstash creates monthly indices.

The 3 Logstash instances (Machine 2, Machine 3, Machine 4) are set to pull from Kafka - which nodes should I set the ES output to?
I have come across articles stating to add all the ES-data-eligible nodes to this list.

My question is:
With 1 primary shard per index, what happens when a document is sent to the node containing the replica shard for that index?
What happens when I add another data-only ES + Logstash node to the cluster? What happens when I add another master+data ES + Logstash node? Do I include or exclude these nodes from all Logstash outputs?

Would it be better to send all Logstash outputs to the co-ordinating node instead?

Thank you!

spinscale · May 8, 2019, 7:29am

Data nodes are the way to go. If the primary shard is not on the node that the client sends the document to, the request will be rerouted internally.

One of the ideas here is that you do not need to worry about topology. As a user you would like to have a URL (or a list of URLs) to connect to, but you don't care whether your cluster has three nodes or a hundred.

So, why data nodes instead of coordinating nodes? If you hit a data node, there is a chance that the primary shard is local, so no forwarding is needed, whereas the coordinating node will always have to forward. Also, in this setup you would send all your data to a single coordinating node instead of spreading the load across several data nodes.
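
As a minimal sketch, the output section on each Logstash instance could look roughly like this. The hostnames `machine2`, `machine3` and `machine4`, the port 9200 and the `logs-` index prefix are assumptions for illustration, not taken from the actual setup. With several entries in `hosts`, the elasticsearch output spreads requests across them:

```
output {
  elasticsearch {
    # Hypothetical hostnames for the three master + data nodes; adjust to your environment.
    hosts => ["http://machine2:9200", "http://machine3:9200", "http://machine4:9200"]
    # Hypothetical index name; the date pattern yields one index per month.
    index => "logs-%{+YYYY.MM}"
  }
}
```

If another data node is added later, it would simply be appended to the `hosts` list.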

Hope that makes sense!

--Alex

Anish_Sujanani · May 8, 2019, 8:19am

That makes perfect sense, thank you so much!

