Mesh Configuration

Hi everyone,

I thought this would work, but it seems it doesn't.

Network devices send syslog to Logstash-1 and Logstash-2
So Logstash-1 and Logstash-2 receive the same logs. (Confirmed.)

Logstash-1 sends logs to ES-A cluster (ES-A1, ES-A2, and ES-A3)
Logstash-2 sends logs to ES-A cluster (ES-A1, ES-A2, and ES-A3)
Can the ES-A cluster get confused if two Logstash servers send it the same log?

Logstash-1 sends logs to ES-B cluster (ES-B1, ES-B2, and ES-B3)
Logstash-2 sends logs to ES-B cluster (ES-B1, ES-B2, and ES-B3)
Can the ES-B cluster get confused if two Logstash servers send it the same log?

Kibana-1 is looking at ES-A cluster
Kibana-2 is looking at ES-B cluster

When we compare search results from Kibana-1 and Kibana-2, they are not identical.

Is this something we shouldn't do, or should this mesh config work?

Thank you for your advice in advance.

Logstash 7.4
ES 7.3
Kibana 7.3

If you have two Logstash instances sending the same data to Elasticsearch one of two things will happen. If you specify a document ID in your output you will get an insert and an update for each document, which is basically twice the indexing load. If you allow Elasticsearch to assign document IDs you will instead get duplicates in your index, which will affect the results in Kibana.
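If you want to keep both Logstash instances sending and deduplicate instead, one common approach is to derive a deterministic document ID from the event itself with the fingerprint filter, so both copies write to the same document. This is only a sketch, assuming your syslog events carry a `message` field; the second insert then becomes an update rather than a duplicate (with the extra indexing load mentioned above):

```
filter {
  fingerprint {
    # Hash the raw message into a metadata field (not indexed)
    source => "message"
    target => "[@metadata][fingerprint]"
    method => "SHA256"
  }
}
output {
  elasticsearch {
    hosts => [ "es-a1.xxx:9200", "es-a2.xxx:9200", "es-a3.xxx:9200" ]
    # Same event from either Logstash maps to the same document ID
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```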

It is quite predictable, so I would not call it confusion.

Can you show us your Logstash config, especially the output part? How are they not identical?

Thanks for the reply.
Here is our Logstash output; we have no document ID specified.

output {
  elasticsearch {
    hosts => [ "es-a1.xxx:9200", "es-a2.xxx:9200", "es-a3.xxx:9200", "es-b1.xxx:9200", "es-b2.xxx:9200", "es-b3.xxx:9200" ]
    manage_template => false
    index => "logstash-core-%{+YYYY.MM.dd}"
  }
}

Both Logstash-1 and Logstash-2 have the exact same output. We push the config via Puppet, so it can't differ.

When we run the same Kibana search, the results look something like this:
Kibana-1 has 38 records
Kibana-2 has 28 records
About 15 records are identical, but the rest are unique.
Does that make sense?

Thanks

That configuration will end up sending each bulk request to one of the listed nodes, so it can go either to cluster A or cluster B, but not both. If you want to send to multiple clusters you will need one elasticsearch output per cluster.

Chris, that is really helpful information.
I had a quick search on the output syntax, but while you are here, could you confirm this please?

Will this do?

output {
  elasticsearch {
    hosts => [ "es-a1.xxx:9200", "es-a2.xxx:9200", "es-a3.xxx:9200" ]
    hosts => [ "es-b1.xxx:9200", "es-b2.xxx:9200", "es-b3.xxx:9200" ]
    manage_template => false
    index => "logstash-core-%{+YYYY.MM.dd}"
  }
}

Or we have to create 2 outputs?

output {
  elasticsearch {
    hosts => [ "es-a1.xxx:9200", "es-a2.xxx:9200", "es-a3.xxx:9200" ]
    manage_template => false
    index => "logstash-core-%{+YYYY.MM.dd}"
  }
}

and

output {
  elasticsearch {
    hosts => [ "es-b1.xxx:9200", "es-b2.xxx:9200", "es-b3.xxx:9200" ]
    manage_template => false
    index => "logstash-core-%{+YYYY.MM.dd}"
  }
}

Sorry, Chris, for asking so much.

You need two separate outputs.
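The two elasticsearch blocks can also live inside a single output section in one file. Based on the snippets you posted above, it would look something like this (a sketch, not tested against your setup):

```
output {
  # Every event goes to cluster A...
  elasticsearch {
    hosts => [ "es-a1.xxx:9200", "es-a2.xxx:9200", "es-a3.xxx:9200" ]
    manage_template => false
    index => "logstash-core-%{+YYYY.MM.dd}"
  }
  # ...and also to cluster B, since outputs run for each event
  elasticsearch {
    hosts => [ "es-b1.xxx:9200", "es-b2.xxx:9200", "es-b3.xxx:9200" ]
    manage_template => false
    index => "logstash-core-%{+YYYY.MM.dd}"
  }
}
```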

Excellent!!
Thanks for all your help. Problem resolved.