Filebeat sending events to Logstash output more than once

I have Filebeat configured with loadbalance: true and multiple Logstash servers in the Logstash output. Everything works fine, and I see a nice distribution of events across all of the Logstash instances.

To prevent duplicates I have followed this Elastic blog post, specifically the approach of bringing our own document ID (we use a UUID generated in the logs).
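For reference, the relevant part of my Logstash pipeline is along these lines (the Elasticsearch endpoint and the uuid field name here are placeholders, not the real values):

output {
  elasticsearch {
    hosts       => ["https://es01.my.domain:9200"]   # placeholder endpoint
    index       => "my-logs"
    action      => "create"              # matches the "create" action shown in the 409 error below
    document_id => "%{[uuid]}"           # the UUID generated in the log line becomes the document _id
  }
}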

The problem is that I'm now facing a whole bunch of 409 errors (version conflict, document already exists). On one hand this is great, since Logstash is doing its job and preventing duplicate data in ES; on the other hand, I want to know why Logstash is processing these events more than once and why it thinks we have duplicate data.

I've been able to confirm that this is almost certainly Filebeat re-sending the log lines to multiple Logstash hosts. I tag each event with the Logstash hostname before it reaches ES, and I can see that a single event was ingested by Logstash host 05 and that, 15 seconds later, Logstash host 01 logged a 409 for the same _id. This pattern repeats multiple times across different Logstash hosts.
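For context, the hostname tagging is done with a small filter along these lines (the logstash.hostname field name is just the one I picked):

filter {
  ruby {
    init => "require 'socket'"
    # record which Logstash node processed the event
    code => "event.set('[logstash][hostname]', Socket.gethostname)"
  }
}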

I have confirmed in the actual log file on the server that the UUID only appears once, so this is definitely not a logging issue and we do not have duplicate UUIDs in the logs.

What can I do to prevent Filebeat from sending the same log line to multiple Logstash hosts? I feel like I've followed best practices when setting this up; am I maybe missing something?

Thanks

Redacted Filebeat Config:

#=========================== Filebeat inputs =============================
filebeat.inputs:
  - type: filestream
    enabled: true
    paths:
      - /path/to/logs/myfile.log
    fields_under_root: true
    fields:
      log_type: payload

# ========================== Filebeat global options ===========================
filebeat.registry.path: ${path.data}/registry
filebeat.registry.file_permissions: 0600
filebeat.registry.flush: 0s
filebeat.shutdown_timeout: 5s

# ================================== General ===================================
name: myhost.hostname
tags: []
fields:
  awsvpc: test
fields_under_root: true
queue:
  mem:
    events: 4096
    flush.min_events: 2048
    flush.timeout: 1s

# ================================= Processors =================================
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~

# ================================== Outputs ===================================
output.logstash:
  enabled: true
  hosts: ['logstash01.my.domain:5044', 'logstash02.my.domain:5044', 'logstash03.my.domain:5044', 'logstash04.my.domain:5044', 'logstash05.my.domain:5044', 'logstash06.my.domain:5044']
  worker: 1
  loadbalance: true
  ssl.certificate_authorities: ["/usr/local/share/ca-certificates/intca.crt"]
  ssl.certificate: "/etc/pki/filebeat.crt"
  ssl.key: "/etc/pki/filebeat.key"
  ssl.supported_protocols: "TLSv1.2"
  ssl.verification_mode: "strict"

# ================================== Logging ===================================
logging.level: info
logging.selectors: []
logging.metrics.enabled: false
logging.to_files: true
logging.files:
  path: /path/to/logs/
  name: filebeat.log
  rotateeverybytes: 52428800
  keepfiles: 7
  permissions: 0600
  interval: 24h
  rotateonstartup: false

Redacted Logstash error:

Failed action {:status=>409, :action=>["create", {:_id=>"xxx-xxx-xxx-xxx-xxx", :_index=>"my-logs", :routing=>nil}, {REDACTED}], :response=>{"create"=>{"_index"=>".ds-my-logs", "_type"=>"_doc", "_id"=>"xxx-xxx-xxx-xxx-xxx", "status"=>409, "error"=>{"type"=>"version_conflict_engine_exception", "reason"=>"[xxx-xxx-xxx-xxx-xxx]: version conflict, document already exists (current version [1])", "index"=>".ds-my-logs", "shard"=>"1", "index_uuid"=>"x-XXXX"}}}}

Does anyone have any thoughts on this? I'm really not sure where to go from here.