Filebeat publishing the whole log again

Hi,

I am using Filebeat to send logs to Logstash. Recently I ran into an issue where Filebeat publishes the whole log file to Logstash again, which creates duplicate entries in Elasticsearch/Kibana.

Please see this Filebeat log:

2017-03-20T12:06:12+05:30 INFO Non-zero metrics in the last 30s: filebeat.harvester.started=1 libbeat.logstash.call_count.PublishEvents=1 publish.events=3 filebeat.harvester.open_files=1 libbeat.logstash.published_and_acked_events=2 libbeat.publisher.published_events=2 registrar.states.update=3 registrar.writes=1 filebeat.harvester.running=1 libbeat.logstash.publish.write_bytes=638 libbeat.logstash.publish.read_bytes=35
2017-03-20T12:06:42+05:30 INFO No non-zero metrics in the last 30s
2017-03-20T12:07:12+05:30 INFO No non-zero metrics in the last 30s
2017-03-20T12:07:42+05:30 INFO No non-zero metrics in the last 30s
2017-03-20T12:07:43+05:30 INFO Harvester started for file: /var/log/radius/radius-rejected.log
2017-03-20T12:07:43+05:30 ERR Failed to publish events caused by: write tcp 127.0.0.1:33786->127.0.0.1:5044: write: connection reset by peer
2017-03-20T12:07:43+05:30 INFO Error publishing events (retrying): write tcp 127.0.0.1:33786->127.0.0.1:5044: write: connection reset by peer
2017-03-20T12:08:12+05:30 INFO Non-zero metrics in the last 30s: libbeat.logstash.published_but_not_acked_events=97 registrar.writes=1 libbeat.logstash.publish.read_bytes=1521 libbeat.logstash.publish.write_bytes=6001 libbeat.logstash.publish.write_errors=2 libbeat.logstash.published_and_acked_events=97 filebeat.harvester.running=1 publish.events=98 libbeat.publisher.published_events=97 libbeat.logstash.call_count.PublishEvents=2 registrar.states.update=98 filebeat.harvester.open_files=1 filebeat.harvester.started=1
2017-03-20T12:08:42+05:30 INFO No non-zero metrics in the last 30s

So it seems there was a connection issue to Logstash, and after some time Filebeat published the full log again (see the publish.events metric), which causes duplicate entries to appear in Elasticsearch/Kibana.

Any idea how to resolve this issue?

Thank you.

I read somewhere that Filebeat's delivery to Logstash uses "at least once" semantics: if there are connection issues and Filebeat doesn't receive an ACK for the data it has sent so far, it re-sends that data to be sure nothing is missed (I personally prefer it this way).
If you don't like that behaviour, try tinkering with this property and see if it helps -> https://www.elastic.co/guide/en/logstash/current/plugins-inputs-beats.html#plugins-inputs-beats-congestion_threshold
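Another option, not from the docs but a common pattern: make re-sends harmless by giving each event a deterministic document ID, so Elasticsearch overwrites a duplicate instead of indexing a second copy. A rough sketch using the Logstash fingerprint filter (the host and index below are placeholders, adjust to your own output):

filter {
  fingerprint {
    source => "message"                    # hash the raw log line
    target => "[@metadata][fingerprint]"   # keep the hash out of the stored document
    method => "MURMUR3"
  }
}
output {
  elasticsearch {
    hosts => ["server03:9200"]
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_id => "%{[@metadata][fingerprint]}"   # same line => same ID => no duplicate doc
  }
}

Caveat: two genuinely identical log lines would also collapse into one document, so if that can happen in your logs, fingerprint a combination of fields (e.g. message plus a timestamp field) instead.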

Regards,
Jai

Thanks. But isn't that value deprecated?

Filebeat will only re-publish the events/lines that were not ACKed. Can you share your Filebeat config? Which version of Filebeat are you using? Which versions of Logstash and the beats input plugin?
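For background: Filebeat tracks how far it has read each file in its registry file (/var/lib/filebeat/registry by default on the Linux packages) and only resumes past the last ACKed offset. A registry entry looks roughly like this (the path and numbers below are made up for illustration):

[
  {
    "source": "/var/log/radius/radius-rejected.log",
    "offset": 4521,
    "FileStateOS": {"inode": 261513, "device": 2049},
    "timestamp": "2017-03-20T12:08:12+05:30",
    "ttl": -1
  }
]

If the registry file is deleted, or the file is truncated or replaced under a new inode (e.g. by rotation), Filebeat no longer recognizes the stored state and reads the content from the start again, which also produces duplicates.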

Hi, I'm new to ELK and I have exactly the same problem.
I tested with v5.2.2 and now I'm using v5.3.0 for both Filebeat and Logstash.

My filebeat.yml:

filebeat.prospectors:
- input_type: log
  paths:
    - /opt/myapp/*.log
  fields:
    application: myapp
  fields_under_root: true

tags: ['json']

output.logstash:
  hosts: ["server02:5044"]

My logstash conf:

input {
  beats {
    port => 5044
  }
}
filter {
  json {
    source => "message"
  }
}
output {
  stdout { codec => json_lines }

  elasticsearch {
    hosts => ["server03:9200"]
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

How do you update your log file? What log rotation scheme are you using?
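Asking because rotation edge cases (truncation while reading, inode reuse) are a common source of re-read lines: copytruncate-style rotation resets the read offset to the beginning of the file, and rename-based rotation can re-deliver content if the registry state is lost. If rotation is in play, prospector options like these are worth checking (a sketch only; some of these are already the defaults):

filebeat.prospectors:
- input_type: log
  paths:
    - /opt/myapp/*.log
  close_renamed: true   # stop harvesting a file once it is renamed away
  close_removed: true   # stop harvesting a file once it is deleted
  clean_removed: true   # drop registry state for files that no longer exist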
