How does logstash route the data to a NEW primary ES node, not the old one?

Jacob_Smith · July 22, 2018, 12:18pm

I'm trying to wrap my head around something. I have 3 ES nodes, with just 1 shard for simplicity. The first node is just the master node. The second node holds the primary shard and the third one holds the replica shard.

Master Node: 10.42.0.100:9200

Data node1 (Primary): 10.42.0.101:9200

Data node2 (Replica): 10.42.0.102:9200

This is my config from logstash, where I write the data:

output {
        elasticsearch {
                hosts  =>  ["10.42.0.101:9200"]
                index => "twitter"
                document_type => "tweet"
                template => "/etc/logstash/template/twitter_template.json"
                template_name => "twitter"
        }
}

Everything looks good and logstash writes the data to my primary ES node. However, what if that node dies completely? How do I make it fail over and write to the replica node?

According to the elastic.co documentation, the master node keeps track of all this and will assign a new primary node if something goes wrong. However, my logstash config doesn't know this since it's hardcoded to the first node. How can I notify logstash that the primary is down and a new one has been assigned?

First, I was thinking of this kind of configuration.

output {
        elasticsearch {
                hosts  =>  ["10.42.0.101:9200", "10.42.0.102:9200"]
                index => "twitter"
                document_type => "tweet"
                template => "/etc/logstash/template/twitter_template.json"
                template_name => "twitter"
        }
}

Writing data to both the replica and primary, but this is just wrong, right? The primary already replicates data to the second node, so it doesn't make any sense to write to them both.

Badger · July 22, 2018, 1:03pm

It does not write to both -- "If given an array it will load balance requests across the hosts specified in the hosts parameter."

Jacob_Smith · July 22, 2018, 1:07pm

Ok, but either way - is writing data to both the replica and primary the way to go here - in order to be fault tolerant?

magnusbaeck · July 22, 2018, 3:17pm

"Ok, but either way - is writing data to both the replica and primary the way to go here - in order to be fault tolerant?"

Again, saying "writing to both" is a misnomer since that's not what happens. But yes, list all known ES nodes in the elasticsearch output (and consider enabling the sniffing option) so that Logstash sends requests to any available node and lets the ES cluster figure out which node has the primary shard for each document that's to be stored. (For clusters sufficiently big to have master-only nodes it's a good idea to avoid those nodes.)
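
As a rough sketch of that advice applied to the config from the first post (untested here; it assumes both data nodes accept HTTP requests on port 9200 and that the sniffing option of the elasticsearch output is available in the plugin version in use):

output {
        elasticsearch {
                # Seed list: every known data node, leaving out the
                # master-only node (10.42.0.100). Requests are load balanced
                # across these hosts; ES itself routes each document to the
                # node that currently holds the primary shard.
                hosts  =>  ["10.42.0.101:9200", "10.42.0.102:9200"]
                # Optional: let the output ask the cluster for its current
                # nodes and update the host list. At least one reachable
                # seed host is still needed in hosts.
                sniffing => true
                index => "twitter"
                document_type => "tweet"
                template => "/etc/logstash/template/twitter_template.json"
                template_name => "twitter"
        }
}

With a list like this, Logstash never needs to know which node holds the primary shard; it only needs at least one node it can reach.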

Jacob_Smith · July 22, 2018, 4:18pm

Ok, thanks. So I should list all data nodes - including the replicas? Is it possible to solely enable the sniffing option and leave the hosts parameter blank, since the sniffing adds them to the hosts list anyway?

Badger · July 22, 2018, 5:04pm

Yes. Think about what happens if 10.42.0.101 crashes. The replica gets promoted to primary and everything should keep on running. If you do not include 10.42.0.102, you will not be able to fail over.

magnusbaeck · July 22, 2018, 5:32pm

"So I should list all data nodes - including the replicas?"

Replica nodes do not exist. Shards have primaries and (possibly) replicas. The shard a particular document ends up in is entirely determined by ES and is not observable from Logstash.

"Is it possible to solely enable the sniffing option and leave the hosts parameter blank, since the sniffing adds them to the hosts list anyway?"

How would the sniffing code know which ES host to contact in the first place?

system · August 19, 2018, 5:32pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
