Filebeat data loss with file rotation when Elasticsearch is not reachable


(AN) #1

I am seeing data loss in the following scenario:

  • Filebeat is sending data to Elasticsearch from a JSON file.
  • Elasticsearch is down.
  • Several (say 5) files are created during this time because of the file rotation policy. The files are named as follows:
    log.json ( actual input file for filebeat )
    arch_2017-10-17.1.json ( format : arch_{date}.{number}.json )
    arch_2017-10-17.2.json
    arch_2017-10-17.3.json
    arch_2017-10-17.4.json

When I restarted Elasticsearch and checked Kibana, I could only see the entries from log.json and arch_2017-10-17.1.json. Entries from the rest of the files are missing.

Filebeat.yml:

filebeat.prospectors:
- input_type: log
  paths:
    - C:\Users\log.json
  document_type: json
  json.keys_under_root: true
  json.add_error_key: true
  json.overwrite_keys: true
  scan_frequency: 1s
  fields_under_root: true
  close_inactive: 24h
  clean_removed: false
  close_removed: false

output.elasticsearch:
  hosts: ["localhost:9200"]
  index: "elastbeat-%{+yyyy.MM.dd}"
  pipeline: "filebeat_pipeline"
  bulk_max_size: 2048

processors:
- drop_fields:
    fields: ["beat.version", "input_type", "offset", "tags"]

What could be the reason these entries do not show up in Kibana?


(Andrew Kroh) #2

If you change the paths to also match the rotated file names, then on restart Filebeat will detect that the file has been rotated and resume from the last offset.

You can add a second path like 'C:\Users\arch*.json' and that should resolve the issue.
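
As a minimal sketch of that change, assuming the other prospector settings from the original post stay as they are, the paths section might look like the following. The glob is the one suggested above; it may need adjusting if the rotated files live in a different directory.

```yaml
filebeat.prospectors:
- input_type: log
  paths:
    # Active file the application is currently writing to
    - C:\Users\log.json
    # Rotated files, e.g. arch_2017-10-17.1.json (glob taken from the suggestion above)
    - C:\Users\arch*.json
  json.keys_under_root: true
  json.add_error_key: true
  json.overwrite_keys: true
```

Keeping both patterns under the same prospector means the active file and the rotated files are read with the same JSON options.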


(AN) #3

I will try this out. I was just wondering if this would result in duplicate events being sent to elasticsearch.


(Andrew Kroh) #4

It should not result in duplicate events. Filebeat is designed to handle log rotation. It knows that the file is the same even though the name has changed because it follows the underlying inode information.


(AN) #5

@andrewkroh Thanks for the suggestion. I tried including the arch*.json glob in the paths, and I can see the logs are published to Elasticsearch. I haven't observed any duplicate entries so far.

