Elasticsearch cluster outage scenarios

I am a newbie, sorry if I am asking a basic question. Here is my environment.

ES 7.2

[es-node-1/kibana1] - cluster - [es-node-2/kibana2]

[logstash1] for network devices log and [logstash2] for server logs.

logstash servers (each logstash has different log type) are sending logs to both es-node-1 and es-node-2.

on [es-node-1],
cluster.name: my-cluster
node.name: es-node-1
network.host: 192.168.200.112
discovery.seed_hosts: ["192.168.200.112","192.168.200.156"]
cluster.initial_master_nodes: ["es-node-1", "es-node-2"]

on [es-node-2],
cluster.name: my-cluster
node.name: es-node-2
network.host: 192.168.200.156
discovery.seed_hosts: ["192.168.200.112","192.168.200.156"]
cluster.initial_master_nodes: ["es-node-1", "es-node-2"]

on [kibana1] same host of es-node-1
server.host: "192.168.200.112"
elasticsearch.hosts: ["http://192.168.200.112:9200"]

on [kibana2] same host of es-node-2
server.host: "192.168.200.156"
elasticsearch.hosts: ["http://192.168.200.156:9200"]

My questions are

  1. Even if we send logs only to es-node-1, es-node-2 syncs the same data. If Logstash sends logs to both es-node-1 and es-node-2, how does the cluster handle the duplicated data?
  2. In my setup each Kibana looks only at the ES node installed on the same machine. Is it possible for Kibana to look at 2 different ES nodes, and if yes, how do I configure it?

I am trying to understand this;
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html

Thanks

You are sending data to the cluster no matter which node you send it to, and it will be stored on both nodes. If you specify both nodes in a single Elasticsearch output block, Logstash will be able to fail over but will not send the same request to all nodes. If you instead use one output block per node, you will get duplicate data in your indices.
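To illustrate the failover setup described above, a single Elasticsearch output block listing both nodes might look like this (a sketch using the IPs from this thread; adjust to your environment):

```
output {
  elasticsearch {
    # One output block, multiple hosts: Logstash sends each event
    # only once and fails over to the other node if one is down.
    hosts => ["192.168.200.112:9200", "192.168.200.156:9200"]
  }
}
```

By contrast, two separate `elasticsearch { ... }` blocks would each receive every event, which is what produces duplicates.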

You just list both Elasticsearch nodes under the elasticsearch.hosts setting as described in the documentation.
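For example, the kibana.yml on either host could list both nodes (a sketch, assuming Kibana 7.x where `elasticsearch.hosts` accepts an array):

```
# kibana.yml - Kibana will fail over between the listed nodes
elasticsearch.hosts: ["http://192.168.200.112:9200", "http://192.168.200.156:9200"]
```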

Note that having a cluster with 2 Elasticsearch nodes does not give you a highly available cluster. The reason for this is that Elasticsearch uses consensus-based algorithms and requires a majority of master-eligible nodes to be available in order to safely elect a master. The majority of 2 is 2, so if you lose one node your cluster will be in trouble. This is why it is recommended to always have at least 3 master-eligible nodes in an Elasticsearch cluster.
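The majority arithmetic behind this recommendation can be sketched in a few lines (not from this thread, just the standard quorum formula):

```python
def quorum(master_eligible_nodes: int) -> int:
    """Minimum number of master-eligible nodes that must be
    reachable for the cluster to safely elect a master."""
    return master_eligible_nodes // 2 + 1

# With 2 nodes the quorum is 2: losing either node stops elections.
print(quorum(2))  # -> 2
# With 3 nodes the quorum is still 2: the cluster survives one failure.
print(quorum(3))  # -> 2
```

This is why 3 master-eligible nodes is the practical minimum for high availability: it is the smallest cluster size whose quorum is smaller than the node count.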

Thank you so much for your reply, Chris.

This is why it is recommended to always have at least 3 master eligible nodes in an Elasticsearch cluster.

Thanks, it's good to know.

If you instead use one output block per node you will get duplicate data in your indices.

Would you tell me how to handle duplicate data in general when looking at an index in Kibana?
By making 2 different index patterns, primary and secondary?
Thanks

If you set it up correctly you will not get duplicate data.
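As an aside, if duplicates can still slip in (for example from retried bulk requests), one common approach is to give each event a deterministic document ID so re-indexing the same event overwrites rather than duplicates it. A sketch using the Logstash fingerprint filter (not part of this thread's setup; field names are illustrative):

```
filter {
  fingerprint {
    # Hash the raw message into a stable ID for this event
    source => "message"
    target => "[@metadata][fingerprint]"
    method => "SHA256"
  }
}
output {
  elasticsearch {
    hosts => ["192.168.200.112:9200", "192.168.200.156:9200"]
    # Same event -> same _id -> overwrite instead of duplicate
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```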

Thanks for clarifying. This pushes to 3 ES nodes, and whichever node I look at in Kibana I don't see duplicated data, which is good.

output {
  elasticsearch {
    hosts => ["192.168.200.112:9200", "192.168.200.156:9200", "192.168.200.15:9200"]
  }
}

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.