Filebeat not sending all lines for an overwritten file

Hello,
Every day we export four configuration files. Each export overwrites the same file with new data that is nearly identical to the previous day's.
My problem is that Filebeat does not recognize the overwritten file as a new file, and I'd like it to send the whole file every day.
Sometimes it sends all the data, sometimes only a subset. Right now, I delete the existing file and recreate it (this takes a few seconds). I also tried deleting the old file and creating a new one under a new name, but Filebeat treats that as a rename.
My latest configuration file looks like this:

  - type: filestream
    id: ConfigurationFile_1
    paths:
      - /var/config/ConfigurationFile_1.csv
    clean_removed: true
    prospector:
      scanner:
        resend_on_touch: true
        fingerprint:
          enabled: true
          offset: 0
          length: 1024
#    file_identity:
#      fingerprint:
#        enabled: true
#        offset: 0
#        length: 1024
    ignore_older: 3m
    fields:
      log_topic: configuration_kafka_topic
    processors:
      - add_fields:
          target: ""
          fields:
            system_name: as002
      - dissect:
          tokenizer: "%{enterpriseId}|%{enterpriseName}|%{groupId}|%{groupName}|%{domainName}|"
          field: "message"
          target_prefix: "SystemConfig"
          trim_values: "all"
      - script:
          lang: javascript
          id: lowercase
          source: >
            function process(event) {
              var groupId = event.Get("SystemConfig.groupId");
              if (groupId != null) {
                event.Put("SystemConfig.groupIdLowerCase", groupId.toLowerCase());
              }
            }

Hi @immortel, welcome to the Elastic community.

You can try setting close_inactive.

When this option is enabled, Filebeat closes the file handle if a file has not been harvested for the specified duration. If the closed file changes again, a new harvester is started and the latest changes will be picked up after scan_frequency has elapsed.
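
For a filestream input like the one posted above, the option names differ from the log input: as far as I know, close_inactive corresponds to close.on_state_change.inactive and scan_frequency corresponds to prospector.scanner.check_interval. A minimal sketch of where they would go, assuming those names apply to your Filebeat version; the values are illustrative only, not recommendations:

```yaml
- type: filestream
  id: ConfigurationFile_1
  paths:
    - /var/config/ConfigurationFile_1.csv
  # Filestream counterpart of the log input's close_inactive.
  # 1m is an illustrative value; it is kept below the ignore_older: 3m
  # from the posted config, since the docs (as I read them) ask that
  # ignore_older be larger than this setting.
  close.on_state_change.inactive: 1m
  # Filestream counterpart of the log input's scan_frequency:
  # how often the configured paths are checked for changes.
  prospector.scanner.check_interval: 10s
  ignore_older: 3m
```

The rest of the input (fields, processors, fingerprint settings) would stay as posted; only the two close/scan options are new here.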

Hello @ashishtiwari1993,
The default value for this option is 5 minutes, and my file is only updated every 24 hours. I don't see what other value I could set that would improve on the 5-minute default.