Logstash and RabbitMQ input duplicate messages

Hello!
I have an annoying problem that I have been trying to solve for about a week now, and I have no idea where to look anymore...

Elastic Stack 7.9.3 on a 3-node cluster (CentOS 7, kernel 3.10.0-1127.19.1.el7.x86_64).
RabbitMQ 3.8.9
Erlang 23.1.2

I'm trying to send log messages from MinIO object storage to RabbitMQ, and then from RabbitMQ to a Logstash input. The problem is that messages are received correctly, but they are always indexed in Elasticsearch as two documents with identical content and different document IDs.

The duplication also happens if I publish a test message from the RabbitMQ GUI to the defined queue, so I guess MinIO is not the problem. In the RabbitMQ GUI I also see that only one message is received by the exchange and only one message is pushed to the queue, so I guess RabbitMQ is working fine as well?

Here is a sample configuration used for Logstash:

input {
    rabbitmq {
        id => "logstash-1-bucketevents"
        host => "ipaddress:5672"
        user => "secret"
        password => "secretpassword"
        heartbeat => 30
        durable => false
        queue => "bucketevents"
    }
}

output {
  elasticsearch {
    hosts => ["https://node1:9200", "https://node2:9200", "https://node3:9200"]
    user => "elastic"
    password => "secretpassword"
  }
  stdout { codec => rubydebug }
}

You get two copies of each event, but you have three instances of Logstash?

If so, I would start by modifying the Logstash configurations to always add the hostname of the node where Logstash is running. Then review the events. Do the duplicates only come from one host?
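A minimal sketch of that approach (the field name is my own, and it assumes the HOSTNAME environment variable is set on each node):

filter {
    mutate {
        # Tag every event with the node that processed it
        add_field => { "logstash_host" => "${HOSTNAME}" }
    }
}

If the duplicates carry two different hostnames, two instances are consuming the same message; if they carry the same hostname, the duplication is happening inside that single instance.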

That said, this really sounds like a RabbitMQ question. If you have a single queue, then messages are load-balanced across clients. If you have multiple queues bound to a fanout or topic exchange, then the same message will go to multiple queues and therefore to multiple clients.
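For reference, a sketch of the single-queue setup with the rabbitmq input (the host, exchange, and key values are assumptions). All consumers naming the same queue share its messages; if each instance instead declared its own queue bound to a fanout exchange, every instance would receive its own copy:

input {
    rabbitmq {
        host => "rabbitmq-vip"       # assumed broker address
        queue => "bucketevents"      # one shared, named queue => load balancing
        exchange => "minio-events"   # assumed exchange to bind the queue to
        key => "bucketlogs"          # binding key for a direct/topic exchange
        durable => true
    }
}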

Are you pointing path.config at a directory? If so, is it possible you have a second output configured in another file? logstash.conf.bak and logstash.conf, for example. Logstash will read every file in the directory and concatenate them into a single pipeline, so every event from every input is sent to every output.

I have 3 Logstash instances behind an HAProxy load balancer and a Pacemaker cluster.
MinIO is sending messages to the HAProxy RabbitMQ VIP. RabbitMQ is configured as a 3-node cluster with queue mirroring. There is a single exchange for MinIO, and it is bound to a single queue for Logstash. I have now configured Logstash for the direct exchange type, and the problem is still happening.

I even tried shutting down HAProxy and the PCS cluster and sending directly to a single Logstash instance, and the duplication problem still happens.

All 3 nodes have two config files in /logstash/conf.d/. The first one:

input {
    udp {
        host => "logstash1-ip"
        port => 5514
        type => syslog
    }

    tcp {
        host => "logstash1-ip"
        port => 5514
        type => syslog
    }
}

output {
  elasticsearch {
    hosts => ["https://node1:9200", "https://node2:9200", "https://node3:9200"]
    user => "elastic"
    password => "secret"
  }
  stdout { codec => rubydebug }
}
And the second one:

input {
    rabbitmq {
        id => "logstash-1-bucketevents"
        host => "10.11.15.141:5672"
        user => "rabbitmq"
        password => "secret"
        heartbeat => 30
        durable => false
        queue => "bucketevents"
    }
}

output {
  elasticsearch {
    hosts => ["https://node1:9200", "https://node2:9200", "https://node3:9200"]
    user => "elastic"
    password => "secret"
  }
  stdout { codec => rubydebug }
}

Well, I removed the output from the second file, and now I get only a single message :smile:
I guess I didn't understand how Logstash works with multiple config files...

It is a common misunderstanding. If you want multiple pipelines, you can use pipelines.yml and point path.config at the individual files there.
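A minimal pipelines.yml sketch (the pipeline IDs and file names are assumptions):

# /etc/logstash/pipelines.yml
- pipeline.id: syslog
  path.config: "/logstash/conf.d/syslog.conf"
- pipeline.id: bucketevents
  path.config: "/logstash/conf.d/bucketevents.conf"

Each pipeline then gets its own isolated set of inputs, filters, and outputs, so an output in one file can no longer pick up events from an input in another.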

Thank you very much! I will try to implement multiple pipelines for an easier overview.
