Any default processor running in Filebeat?

Hi,

I was reading this GitHub thread, and the author mentioned:

  • running netflow/filebeat through agent (aka default processors enabled) increases each event by 80% in size. Thus a good performance optimisation, when wire transfer speed to Elasticsearch is limited, is to disable some of the processors?!

This got me wondering whether any processors run automatically in Filebeat (v8.8.0). I'm also trying to optimize my Filebeat/netflow performance, and currently the processors section of my filebeat.yml contains only a drop_fields. However, I noticed that my records include the community_id field, which I would expect to come from the community_id processor, and I have not explicitly enabled that processor in my Filebeat configuration.

So my question is: is there a way to find out which processors run automatically in Filebeat? I might want to disable some of them to improve performance.

Thank you.

While modules like zeek use the community_id processor, the netflow input generates the community_id directly, without a processor.

Filebeat should largely not invoke processors that are not specified in the configuration of the module you're using, so the first step is to look at the module/dataset config, located at ./module/zeek/connection/config/connection.yml

It defines the following processors:


processors:
  - drop_fields:
      fields: ["json.orig_bytes","json.resp_bytes","json.tunnel_parents"]
      ignore_missing: true
  - rename:
      fields:
        - from: "json"
          to: "zeek.connection"

        - from: "zeek.connection.duration"
          to: "temp.duration"

        - from: "zeek.connection.id.orig_h"
          to: "source.address"

        - from: "zeek.connection.id.orig_p"
          to: "source.port"

        - from: "zeek.connection.id.resp_h"
          to: "destination.address"

        - from: "zeek.connection.id.resp_p"
          to: "destination.port"

        - from: "zeek.connection.proto"
          to: "network.transport"

        - from: "zeek.connection.service"
          to: "network.protocol"

        - from: "zeek.connection.uid"
          to: "zeek.session_id"

        - from: "zeek.connection.orig_ip_bytes"
          to: "source.bytes"

        - from: "zeek.connection.resp_ip_bytes"
          to: "destination.bytes"

        - from: "zeek.connection.orig_pkts"
          to: "source.packets"

        - from: "zeek.connection.resp_pkts"
          to: "destination.packets"

        - from: "zeek.connection.conn_state"
          to: "zeek.connection.state"

        - from: "zeek.connection.orig_l2_addr"
          to: "source.mac"

        - from: "zeek.connection.resp_l2_addr"
          to: "destination.mac"

      ignore_missing: true
      fail_on_error: false

  - rename:
      when.equals.network.transport: icmp
      fields:
        - from: "source.port"
          to: "zeek.connection.icmp.type"

        - from: "destination.port"
          to: "zeek.connection.icmp.code"

      ignore_missing: true
      fail_on_error: false
  - convert:
      fields:
        - {from: "zeek.session_id", to: "event.id"}
        - {from: "source.address", to: "source.ip", type: "ip"}
        - {from: "destination.address", to: "destination.ip", type: "ip"}
      ignore_missing: true
      fail_on_error: false
  - add_fields:
      target: event
      fields:
        kind: event
        category:
          - network
  - if:
      equals.network.transport: icmp
    then:
      community_id:
        fields:
          icmp_type: zeek.connection.icmp.type
          icmp_code: zeek.connection.icmp.code
    else:
      community_id:
{{ if .internal_networks }}
  - add_network_direction:
      source: source.ip
      destination: destination.ip
      target: network.direction
      internal_networks: {{ .internal_networks | tojson }}
{{ end }}
  - add_fields:
      target: ''
      fields:
        ecs.version: 1.12.0
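
As a rough illustration of reading such a module config, the sketch below lists the top-level processor names in a config like the one above. It is only a sketch: the function name list_processor_names is hypothetical, it assumes the two-space "  - name:" indentation seen in this file, and it skips the Go template lines ({{ ... }}) rather than rendering them. Processors nested inside an if/then/else (such as community_id here) are not reported.

```python
import re

# Matches a top-level list entry under `processors:`, e.g. "  - rename:".
# Assumes the two-space indentation used in the config shown above.
TOP_LEVEL_ENTRY = re.compile(r"^  - ([A-Za-z_]+):")


def list_processor_names(config_text):
    """Return top-level processor names in order of appearance."""
    names = []
    for line in config_text.splitlines():
        # Skip Go template directives such as "{{ if .internal_networks }}".
        if line.lstrip().startswith("{{"):
            continue
        match = TOP_LEVEL_ENTRY.match(line)
        if match:
            names.append(match.group(1))
    return names


# Shortened excerpt of the config above, used only as sample input.
sample = """processors:
  - drop_fields:
      fields: ["json.orig_bytes"]
  - rename:
      fields:
        - from: "json"
          to: "zeek.connection"
  - if:
      equals.network.transport: icmp
    then:
      community_id:
    else:
      community_id:
{{ if .internal_networks }}
  - add_network_direction:
      source: source.ip
{{ end }}
  - add_fields:
      target: ''
"""

print(list_processor_names(sample))
# ['drop_fields', 'rename', 'if', 'add_network_direction', 'add_fields']
```

Note that add_network_direction is listed even though it only takes effect when internal_networks is set, because the sketch ignores the template condition.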

That being said, if you invoke the beat with debug logging enabled (./filebeat -e -d "*" run), it will tell you which processors were instantiated:

Generated new processors: drop_fields={\"Fields\":[\"json.orig_bytes\",\"json.resp_bytes\",\"json.tunnel_parents\"],\"RegexpFields\":[],\"IgnoreMissing\":true}, rename=[{From:json To:zeek.connection} {From:zeek.connection.duration To:temp.duration} {From:zeek.connection.id.orig_h To:source.address} {From:zeek.connection.id.orig_p To:source.port} ...

Full log:

{"log.level":"debug","@timestamp":"2024-08-08T11:15:58.919-0500","log.logger":"cfgfile","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/cfgfile.(*Reloader).Run","file.name":"cfgfile/reload.go","file.line":212},"message":"Number of module configs found: 1","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-08-08T11:15:58.919-0500","log.logger":"reload","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/cfgfile.(*RunnerList).Reload","file.name":"cfgfile/list.go","file.line":93},"message":"Starting reload procedure, current runners: 0","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-08-08T11:15:58.919-0500","log.logger":"reload","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/cfgfile.(*RunnerList).Reload","file.name":"cfgfile/list.go","file.line":111},"message":"Start list: 1, Stop list: 0","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2024-08-08T11:15:58.921-0500","log.logger":"modules","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/fileset.newModuleRegistry","file.name":"fileset/modules.go","file.line":136},"message":"Enabled modules/filesets: zeek (connection)","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-08-08T11:15:58.922-0500","log.logger":"input","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/log.(*config).resolveRecursiveGlobs","file.name":"log/config.go","file.line":207},"message":"recursive glob enabled","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-08-08T11:15:58.922-0500","log.logger":"conditions","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/conditions.NewCondition","file.name":"conditions/conditions.go","file.line":98},"message":"New condition equals: map[network.transport:0x10311dee0]","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-08-08T11:15:58.922-0500","log.logger":"conditions","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/conditions.NewCondition","file.name":"conditions/conditions.go","file.line":98},"message":"New condition equals: map[network.transport:0x10311dee0]","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-08-08T11:15:58.922-0500","log.logger":"processors","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/processors.New","file.name":"processors/processor.go","file.line":114},"message":"Generated new processors: community_id=[target=network.community_id, fields=[source_ip=source.ip, source_port=source.port, destination_ip=destination.ip, destination_port=destination.port, transport_protocol=network.transport, icmp_type=zeek.connection.icmp.type, icmp_code=zeek.connection.icmp.code], seed=0]","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-08-08T11:15:58.923-0500","log.logger":"processors","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/processors.New","file.name":"processors/processor.go","file.line":114},"message":"Generated new processors: community_id=[target=network.community_id, fields=[source_ip=source.ip, source_port=source.port, destination_ip=destination.ip, destination_port=destination.port, transport_protocol=network.transport, icmp_type=icmp.type, icmp_code=icmp.code], seed=0]","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-08-08T11:15:58.923-0500","log.logger":"processors","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/processors.New","file.name":"processors/processor.go","file.line":114},"message":"Generated new processors: drop_fields={\"Fields\":[\"json.orig_bytes\",\"json.resp_bytes\",\"json.tunnel_parents\"],\"RegexpFields\":[],\"IgnoreMissing\":true}, rename=[{From:json To:zeek.connection} {From:zeek.connection.duration To:temp.duration} {From:zeek.connection.id.orig_h To:source.address} {From:zeek.connection.id.orig_p To:source.port} {From:zeek.connection.id.resp_h To:destination.address} {From:zeek.connection.id.resp_p To:destination.port} {From:zeek.connection.proto To:network.transport} {From:zeek.connection.service To:network.protocol} {From:zeek.connection.uid To:zeek.session_id} {From:zeek.connection.orig_ip_bytes To:source.bytes} {From:zeek.connection.resp_ip_bytes To:destination.bytes} {From:zeek.connection.orig_pkts To:source.packets} {From:zeek.connection.resp_pkts To:destination.packets} {From:zeek.connection.conn_state To:zeek.connection.state} {From:zeek.connection.orig_l2_addr To:source.mac} {From:zeek.connection.resp_l2_addr To:destination.mac}], rename=[{From:source.port To:zeek.connection.icmp.type} {From:destination.port To:zeek.connection.icmp.code}], condition=equals: map[network.transport:0x10311dee0], convert={\"Fields\":[{\"From\":\"zeek.session_id\",\"To\":\"event.id\",\"Type\":\"[unset]\"},{\"From\":\"source.address\",\"To\":\"source.ip\",\"Type\":\"ip\"},{\"From\":\"destination.address\",\"To\":\"destination.ip\",\"Type\":\"ip\"}],\"Tag\":\"\",\"IgnoreMissing\":true,\"FailOnError\":false,\"Mode\":\"copy\"}, add_fields={\"event\":{\"category\":[\"network\"],\"kind\":\"event\"}}, if equals: map[network.transport:0x10311dee0] then community_id=[target=network.community_id, fields=[source_ip=source.ip, source_port=source.port, destination_ip=destination.ip, destination_port=destination.port, transport_protocol=network.transport, icmp_type=zeek.connection.icmp.type, icmp_code=zeek.connection.icmp.code], seed=0] else community_id=[target=network.community_id, fields=[source_ip=source.ip, source_port=source.port, destination_ip=destination.ip, destination_port=destination.port, transport_protocol=network.transport, icmp_type=icmp.type, icmp_code=icmp.code], seed=0], networkDirection=source.ip|destination.ip->network.direction, add_fields={\"ecs\":{\"version\":\"1.12.0\"}}","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-08-08T11:15:58.923-0500","log.logger":"input","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/log.(*Input).loadStates","file.name":"log/input.go","file.line":188},"message":"exclude_files: [(?-ms:.gz$)]. Number of states: 0","service.name":"filebeat","input_id":"a5f4eb7b-3f1e-4516-8478-a50ac0482057","ecs.version":"1.6.0"}
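
Since the debug output is verbose, it can help to filter it down to the processor lines. The sketch below is one possible way to do that, assuming the JSON log format shown above (one object per line, with "log.logger" and "message" keys). The function name generated_processors and the sample records are made up for illustration.

```python
import json

# Prefix of the debug message that lists instantiated processors,
# as it appears in the log output above.
PREFIX = "Generated new processors: "


def generated_processors(log_lines):
    """Return the processor descriptions found in JSON log lines."""
    results = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Ignore anything that is not a JSON log record.
            continue
        if not isinstance(record, dict):
            continue
        if record.get("log.logger") != "processors":
            continue
        message = record.get("message", "")
        if message.startswith(PREFIX):
            results.append(message[len(PREFIX):])
    return results


# Hypothetical, shortened records in the same shape as the log above.
sample_lines = [
    json.dumps({"log.level": "debug", "log.logger": "cfgfile",
                "message": "Number of module configs found: 1"}),
    json.dumps({"log.level": "debug", "log.logger": "processors",
                "message": "Generated new processors: "
                           "community_id=[target=network.community_id]"}),
    "not json at all",
]

for description in generated_processors(sample_lines):
    print(description)
# community_id=[target=network.community_id]
```

To use it on real output, the same function could be fed the lines of a saved log file. If no "Generated new processors" line mentions a processor beyond those in your own configuration, none were instantiated at that level, though, as with community_id in the netflow input, a field can still be produced by the input itself.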

I don't see any other processor being generated, other than the drop_fields that I have specified in my filebeat.yml. I guess that should mean that no other processor was started. Thanks!