Filebeat Publish Issue

Hello!

We're using Filebeat (5.6) shipping to Elasticsearch (5.6). We suddenly noticed that we stop receiving data from Filebeat after a couple of hours.

The filebeat config looks like this:

filebeat.prospectors:
- input_type: log
  paths:
    - /logs/filebeat.log
  encoding: utf-8

processors:
  - drop_fields:
      fields: ["beat","input_type","offset","source"]

output.elasticsearch:
  hosts: ["elasticsearch-ingest-1.internal:9200","elasticsearch-ingest-2.internal:9200"]
  pipeline: filebeat-pipeline
  bulk_max_size: 1024
  timeout: 90

logging.level: info

Based on the Filebeat logs, the data stopped being ack'ed by ES once the publish event count jumped from 306 to about 13K.

2018-11-13T23:59:50+08:00 INFO Non-zero metrics in the last 30s: libbeat.es.call_count.PublishEvents=6 libbeat.es.publish.read_bytes=5946 libbeat.es.publish.write_bytes=494847 libbeat.es.published_and_acked_events=306 libbeat.publisher.published_events=306 publish.events=306 registrar.states.update=306 registrar.writes=6
2018-11-14T00:00:08+08:00 INFO Harvester started for file: /logs/filebeat.log
2018-11-14T00:00:20+08:00 INFO Non-zero metrics in the last 30s: filebeat.harvester.open_files=1 filebeat.harvester.running=1 filebeat.harvester.started=1 libbeat.es.call_count.PublishEvents=13604 libbeat.es.publish.read_bytes=3756603 libbeat.es.publish.write_bytes=9550702 libbeat.es.published_and_acked_events=107 libbeat.es.published_but_not_acked_events=13600 libbeat.publisher.published_events=108 publish.events=71 registrar.states.update=71 registrar.writes=3
2018-11-14T00:00:50+08:00 INFO Non-zero metrics in the last 30s: libbeat.es.call_count.PublishEvents=40450 libbeat.es.publish.read_bytes=11164549 libbeat.es.publish.write_bytes=27910500 libbeat.es.published_but_not_acked_events=40450
2018-11-14T00:01:20+08:00 INFO Non-zero metrics in the last 30s: libbeat.es.call_count.PublishEvents=44151 libbeat.es.publish.read_bytes=12185779 libbeat.es.publish.write_bytes=30464190 libbeat.es.published_but_not_acked_events=44151
2018-11-14T00:01:50+08:00 INFO Non-zero metrics in the last 30s: libbeat.es.call_count.PublishEvents=44582 libbeat.es.publish.read_bytes=12304708 libbeat.es.publish.write_bytes=30761580 libbeat.es.published_but_not_acked_events=44582
2018-11-14T00:02:20+08:00 INFO Non-zero metrics in the last 30s: libbeat.es.call_count.PublishEvents=44234 libbeat.es.publish.read_bytes=12208704 libbeat.es.publish.write_bytes=30520770 libbeat.es.published_but_not_acked_events=44234

After restarting Filebeat, it was able to publish all the data:

2018-11-14T10:00:04+08:00 INFO Non-zero metrics in the last 30s: filebeat.harvester.open_files=1 filebeat.harvester.running=1 filebeat.harvester.started=1 filebeat.prospector.log.files.truncated=1 libbeat.es.call_count.PublishEvents=46 libbeat.es.publish.read_bytes=490354 libbeat.es.publish.write_bytes=76342445 libbeat.es.published_and_acked_events=44407 libbeat.publisher.published_events=44407 publish.events=44429 registrar.states.current=21 registrar.states.update=44429 registrar.writes=24

From the logs above, it looks like Filebeat was able to publish the events even though the count reached 44407, whereas before the restart a similar number of events was never ack'ed.

To find the number of events published to ES, should I look at libbeat.es.call_count.PublishEvents or libbeat.publisher.published_events? I ask because nothing changed in Filebeat's config, yet it worked after the restart.

Thank you.

The last metrics before the restart do indicate that Elasticsearch did not ACK any events. After the restart it did indeed ACK many events.

Do you have some error logs as well? Also check for errors in the Elasticsearch logs.

Debug logs from Filebeat with the debug selector "elasticsearch" might give us some more details.
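
For reference, a minimal sketch of what that could look like in filebeat.yml. It assumes the standard `logging.level` and `logging.selectors` options and replaces the existing `logging.level: info` line; the selector name is the one mentioned above, and I have not verified it against this exact 5.6 build:

```yaml
# Sketch only: switch from info to debug logging.
logging.level: debug
# Limit debug output to the named component so the log stays readable.
# "elasticsearch" is the selector suggested in this thread (assumed, not verified here).
logging.selectors: ["elasticsearch"]
```

Debug logging is verbose, so it is probably best to switch back to `info` once the problem has been captured.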

Sorry, I forgot to mention that there are no errors or unusual logs in Elasticsearch. We have other nodes publishing to Elasticsearch using Filebeat, and only this specific node encountered the issue.

I'm afraid Filebeat's log level was INFO at the time, so there are no other logs besides the ones above.

I see.
I haven't seen this error before, so debug logs would definitely be helpful.

I'd also recommend upgrading Filebeat to 6.x. The publishing pipeline and retry handling were rewritten and made more robust in 6.0.
