Filebeat reading entire mounted azure file from start

Hi

I have an Azure file that I want to monitor using Filebeat. I am using the Helm chart for this purpose, with the following values.yaml:

nodeSelector: { beta.kubernetes.io/os: linux }

extraVolumes:
  - name: module-that-produces-logs
    persistentVolumeClaim:
      claimName: module-that-produces-logs

extraVolumeMounts:
  - name: module-that-produces-logs
    mountPath: /mnt/log/module-that-produces-logs
    readOnly: true

filebeatConfig:
  filebeat.yml: |
    output.elasticsearch:
      hosts: ["elasticsearch-master.logging.svc.cluster.local:9200"]
    filebeat.inputs:
      - type: log
        file_identity.path: ~
        enabled: true
        paths:
          - /mnt/log/*/*/*.log
          - /mnt/log/*/*.log
          - /mnt/log/*/*/*.txt
          - /mnt/log/*/*.txt

However, I am experiencing two issues that are causing me a headache:

  1. Filebeat sends the entire .log file contents to Elasticsearch instead of just the latest changes. I hoped this would be resolved by using file_identity.path, but no such luck. Any advice on how to move forward would be appreciated.
  2. This configuration creates two Filebeat pods, which each mount the Azure file and send duplicate events to Elasticsearch. How do I reduce this to just one pod? I can't seem to find this option in the Filebeat Helm chart's values.yaml.

Thanks in advance,
/David

How exactly did you change file_identity? Did you first run Filebeat a couple of times with the default file_identity and then change it?

Migrating from one file_identity setting to another can be rocky if the registry has multiple entries for the same path, and based on the behaviour you are describing, that seems to be the case. I suggest you remove the registry file and start with a clean state using file_identity.path.
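A minimal sketch of that clean-up, assuming the chart's default data path (/usr/share/filebeat/data); the kubectl commands in the comments use a hypothetical pod name, and the runnable part below simulates the same step against a throwaway local directory so it is safe to dry-run:

```shell
# Against the real pod this would be (pod name hypothetical):
#   kubectl exec filebeat-filebeat-abcde -- rm -rf /usr/share/filebeat/data/registry
#   kubectl delete pod filebeat-filebeat-abcde   # the DaemonSet recreates the pod
# Simulated locally:
DATA_DIR=$(mktemp -d)
mkdir -p "$DATA_DIR/registry/filebeat"   # stand-in for Filebeat's registry directory
rm -rf "$DATA_DIR/registry"              # the actual clean-up step
[ ! -d "$DATA_DIR/registry" ] && echo "registry removed"
```

Note that with persistent volumes you may also need to clear the registry on every node Filebeat runs on, since each DaemonSet pod keeps its own state.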

Hi! Thanks for answering. Yes, that is pretty much what I did.

I tried deleting the registry directory (/usr/share/filebeat/data/registry) on both nodes I am running Filebeat on, but I am unfortunately still seeing the same behaviour. For instance, writing A, then B, then C into an empty test log file produces this output in Kibana:

A <- correct
B <- correct
A <- A, B are the full log contents
B
C <- correct
A <- A, B, C are the full log contents
B
C

Hi again

It is plausible that this issue happened because the Azure file is also mounted on a Windows node, and that writing to the file from Windows somehow changes it (line endings or something similar), leading to the whole file being recognized as new by Filebeat.
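If that hypothesis is right, a quick way to check is to look for carriage returns in the file. A minimal sketch, using a temporary file as a stand-in for the real log under /mnt/log:

```shell
# Simulate a file rewritten from Windows with CRLF line endings,
# then test for carriage returns; run the same grep against the
# real log file to check its endings.
LOG=$(mktemp)
printf 'A\r\nB\r\nC\r\n' > "$LOG"    # stand-in for a Windows-written log
if grep -q $'\r' "$LOG"; then
  echo "CRLF line endings detected"
else
  echo "Unix (LF) line endings"
fi
```

If the file does flip between LF and CRLF, the whole content changes on disk, which would be consistent with Filebeat re-reading it from the start.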

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.