Filebeat - Send Only File Metadata Without Reading Content

Hello,

I am trying to configure Filebeat to send only the metadata of the files it finds, specifically the [log][file][path] field.

My Requirements:

  • I only need the file path ([log][file][path]), not the content.
  • I do not want Filebeat to send the file’s content.
  • I am not parsing the file content in Logstash.
  • I will only process [log][file][path] in Logstash to generate statistics later in Kibana.
  • Ideally, I want one event per file detected in the configured path. But if Filebeat sends the content, I end up with one event per line in each file, which is not what I need.

Issue Faced:

I have tried various configurations (close_eof, exclude_lines, multiline filters, etc.), but either:

  1. Filebeat does not send anything at all, or
  2. Filebeat still sends file content, which I don’t need.

Question:

Is there a way to configure Filebeat to only send file metadata (like [log][file][path]), ensuring that I receive exactly one event per detected file, without processing its content?

Current Configuration:

Filebeat:

filebeat.inputs:
  - type: filestream
    enabled: true
    paths:
      - /data/EDT/files/*/application/efluidconnect-*/files/*/*/*/ERREUR/*
      - /data/EDT/files/*/application/efluidconnect-*/files/*/*/ERREUR/*
    fields:
      log_type: "supervision_efluidconnect"

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 1

output.logstash:
  hosts: ["localhost:5044"]

processors:
  - drop_fields:
      fields: ["agent", "input", "host", "ecs", "@version", "event", "message"]

Logstash:

input {
  beats {
    port => 5044
  }
}
filter {
  if ([fields][log_type] == "supervision_efluidconnect" ) {
    grok {
      match => { "[log][file][path]" => [ "^%{DATA:PATH}$" ] }
    }
    grok {
      match => { "[PATH]" => [ "^/data/EDT/files/%{WORD:ENVIRONNEMENT}/application/%{NOTSPACE:APPLICATION}/files/%{WORD:IN_OR_OUT}/%{WORD:INTERFACE}/ERREUR/%{DATA:FILENAME}$",
                               "^/data/EDT/files/%{WORD:ENVIRONNEMENT}/application/%{NOTSPACE:APPLICATION}/files/%{WORD:IN_OR_OUT}/%{WORD:INTERFACE}/%{WORD:SOUS_INTERFACE}/ERREUR/%{DATA:FILENAME}$"
                             ]
      }
      tag_on_failure => [ "_grok_efluidconnect_match_pattern_file_nomatch" ]
      add_tag        => [ "_grok_efluidconnect_success" ]
    }
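    # Keep the index name in @metadata (not sent to the output) so that it
    # survives the removal of "fields" below; otherwise the index setting
    # in the output can no longer resolve [fields][log_type].
    mutate {
      add_field => { "[@metadata][index]" => "%{[fields][log_type]}" }
    }
    mutate {
      remove_field => [ "fields", "@version", "log" ]
    }
  }
}
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "%{[@metadata][index]}"
  }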
  stdout { codec => rubydebug }
}

Thanks for your help!

Could you use Auditbeat for this?

Auditbeat has a File Integrity Monitoring module (file_integrity), which can be configured to produce events for the initial scan, file creation, and file modification.

If you keep only the events produced by the initial scan and by file creation, you get exactly one event per file, and the file content is never shipped.

You'll probably want to disable file hashing as well.
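
Something along these lines might be a starting point. It is an untested sketch, not a verified configuration. The parent directory /data/EDT/files is taken from your globs, because as far as I know the file_integrity paths setting does not accept globs. The regular expression on file.path is my own guess at what your ERREUR directories look like. I am also assuming that an empty hash_types list turns hashing off, so please check that against the documentation for your Auditbeat version.

```yaml
# auditbeat.yml (sketch, adjust before use)
auditbeat.modules:
  - module: file_integrity
    # Watch the common parent directory recursively instead of using globs.
    paths:
      - /data/EDT/files
    recursive: true
    scan_at_start: true
    # Assumption: an empty list disables hashing.
    hash_types: []

processors:
  # Drop everything except initial-scan and creation events
  # for regular files located directly inside an ERREUR directory.
  - drop_event:
      when:
        not:
          and:
            - or:
                - contains:
                    event.action: initial_scan
                - contains:
                    event.action: created
            - equals:
                file.type: file
            - regexp:
                file.path: '/ERREUR/[^/]+$'

output.logstash:
  hosts: ["localhost:5044"]
```

Note that Auditbeat reports the path in [file][path] rather than [log][file][path], so the first grok in your Logstash filter would need to read from that field. Watching a large tree recursively can also be expensive, so it may be worth listing the deeper application directories explicitly if there are only a few of them.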