Filebeat reading entire mounted azure file from start

dthemg · April 27, 2021, 2:09pm

Hi

I have an Azure file share that I want to monitor using Filebeat. I am using the Helm chart for this purpose, with the following values.yaml:

  nodeSelector: { beta.kubernetes.io/os: linux }

  extraVolumes:
    - name: module-that-produces-logs
      persistentVolumeClaim:
        claimName: module-that-produces-logs

  extraVolumeMounts:
    - name: module-that-produces-logs
      mountPath: /mnt/log/module-that-produces-logs
      readOnly: true

  filebeatConfig:
    filebeat.yml: |
      output.elasticsearch:
        hosts: ["elasticsearch-master.logging.svc.cluster.local:9200"]
      filebeat.inputs:
        - type: log
          file_identity.path: ~
          enabled: true
          paths:
          - /mnt/log/*/*/*.log
          - /mnt/log/*/*.log
          - /mnt/log/*/*/*.txt
          - /mnt/log/*/*.txt

However, I am experiencing two issues that are causing me a headache:

1. Filebeat sends the entire .log file contents to Elasticsearch instead of just the latest changes. I hoped that this would be resolved by using file_identity.path, but no such luck. Any advice on how to move forward would be appreciated.
2. This configuration creates two Filebeat pods, which each mount the Azure file, so every event is sent to Elasticsearch twice. How do I reduce this to just one pod? I can't seem to find this option in the Filebeat Helm chart values.yaml.

Thanks in advance,
/David

kvch · April 28, 2021, 8:49am

How exactly did you change file_identity? Did you first run Filebeat a couple of times with the default file_identity and then change it?

Migrating from one file_identity setting to another can be rocky if the registry has multiple entries for the same path. Based on the Filebeat behaviour you are describing, that seems to be the case. I suggest you remove the registry file and start from a clean state with file_identity.path.
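
To illustrate why the identity setting matters, here is a toy model in Python. It is only a sketch and not Filebeat's actual implementation: ToyHarvester, FileSnapshot, simulate and the path /mnt/log/app/test.log are all hypothetical names. It assumes the inode reported for the file changes between writes, which can happen on some network shares. With a "native" key (device plus inode) every scan looks like a new file and is read from offset 0, while a path-based key keeps the stored offset.

```python
from dataclasses import dataclass, field


@dataclass
class FileSnapshot:
    """What one scan sees: the path, the (device, inode) pair, and the lines in the file."""
    path: str
    device: int
    inode: int
    lines: list


@dataclass
class ToyHarvester:
    """Hypothetical stand-in for a log shipper; NOT Filebeat's real code."""
    identity: str  # "native" keys by (device, inode); "path" keys by path
    registry: dict = field(default_factory=dict)  # identity key -> lines already shipped

    def _key(self, snap):
        if self.identity == "native":
            return (snap.device, snap.inode)
        return snap.path

    def scan(self, snap):
        """Return the lines shipped for this scan and remember the new offset."""
        key = self._key(snap)
        offset = self.registry.get(key, 0)  # unknown identity -> start from 0
        self.registry[key] = len(snap.lines)
        return snap.lines[offset:]


def simulate(identity):
    """Write A, then B, then C, with the inode changing on every write (an assumption)."""
    harvester = ToyHarvester(identity)
    path = "/mnt/log/app/test.log"  # hypothetical path
    shipped = []
    shipped += harvester.scan(FileSnapshot(path, 1, 100, ["A"]))
    shipped += harvester.scan(FileSnapshot(path, 1, 101, ["A", "B"]))
    shipped += harvester.scan(FileSnapshot(path, 1, 102, ["A", "B", "C"]))
    return shipped


if __name__ == "__main__":
    print("native:", simulate("native"))
    print("path:  ", simulate("path"))
```

In this model the "native" run ships the full contents on every scan, while the "path" run ships each line once. A registry that still holds entries from both identity schemes is outside what this sketch covers, which is why starting from a clean registry is the simpler test.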

dthemg · April 28, 2021, 11:43am

Hi! Thanks for answering. Yes that is pretty much what I did.

I tried deleting the registry directory (/usr/share/filebeat/data/registry) on both nodes I am running Filebeat on, but unfortunately I am still getting the same behaviour. For instance, writing A, then B, then C into an empty test log file produces this output in Kibana:

A <- correct
B <- correct
A <- A, B are the full log contents
B
C <- correct
A <- A, B, C are the full log contents
B
C

dthemg · April 30, 2021, 12:23pm

Hi again

It is plausible that this issue happened because the Azure file is also mounted on a Windows node, and that writing to the file from Windows somehow changes it (line endings or something similar), so that Filebeat recognizes the whole file as new.
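
One way to check a hypothesis like this is to record what the mount reports for the file before and after a write from the other node. The Python sketch below uses only os.stat; file_fingerprint and describe_change are hypothetical helper names, and the categories it reports are a simplification of the reasons a shipper might restart from the beginning (a changed device/inode pair, or a file that became smaller).

```python
import os


def file_fingerprint(path):
    """Capture the fields a file-tracking tool typically relies on."""
    st = os.stat(path)
    return {
        "device": st.st_dev,
        "inode": st.st_ino,
        "size": st.st_size,
        "mtime_ns": st.st_mtime_ns,
    }


def describe_change(before, after):
    """Compare two fingerprints and list what looks suspicious."""
    notes = []
    if (before["device"], before["inode"]) != (after["device"], after["inode"]):
        # A new device/inode pair looks like a different file to inode-based tracking.
        notes.append("identity changed")
    if after["size"] < before["size"]:
        # A smaller file looks like a truncation.
        notes.append("file shrank")
    if not notes:
        notes.append("appended or unchanged")
    return notes
```

Taking a fingerprint of the test log inside the Filebeat pod, appending a line from the Windows node, and then comparing a second fingerprint with describe_change would show whether the inode or size behaves unexpectedly on this mount.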

system · May 28, 2021, 2:24pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
