Filter unstructured logs with filebeat before sending to logstash

Hello, I'm new to Elasticsearch and have some questions. Filebeat is fetching way too many logs, and to save bandwidth I want to filter the logs at the edge before sending them to Logstash. The logs don't all share the same pattern, so I doubt that the Filebeat dissect processor can be used to achieve this.

Is there any way to achieve this and send only logs with ERROR and INFO log levels?

You need to provide more information. Please share your filebeat.yml file and sample lines from your log file, both the lines you want to collect and the lines you do not want to collect.

You have the option to exclude lines that you do not want to send based on a regex.
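For example, the filestream input has an exclude_lines setting that drops any line matching one of the given regular expressions before it is sent. A minimal sketch (the id, path, and patterns here are placeholders, not taken from your setup):

filebeat.inputs:
- type: filestream
  id: example-filestream-id
  paths:
    - /var/log/example/*.log
  # Lines matching any of these regexes are dropped at the edge.
  exclude_lines: ['^DEBUG', '^TRACE']

There is also a matching include_lines setting if it is easier to express what you want to keep.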

Below are samples of the logs. I want to send only logs that have INFO or ERROR as the log level; I don't want logs like the last two lines. The first 3 lines have different patterns, so I am not sure using dissect would work.

2022-10-05 17:49:24,795] {processor.py:651} INFO - DAG(s) dict_keys(['log_filtered'])

[2022-10-05 17:48:22,005: INFO/ForkPoolWorker-1] Filling up the DagBag from /opt/airflow/dags/population.py

[2022-10-05 17:48:21,662] {population.py:24} INFO - I was executed 

/opt/airflow/dags/population.py 

Stale pidfile exists - Removing it.

filebeat.yml

filebeat.inputs:
- type: filestream
  id: my-filestream-id
  enabled: true
  paths:
    - /home/ubuntu/logs/scheduler/**/*.log

filebeat.config.modules:
  path: /etc/filebeat/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 1

output.logstash:
  hosts: ["${logstash_ip}:5044"]

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
  - drop_fields:
      fields: ["agent", "cloud", "ecs", "host", "input", "tags"]
      ignore_missing: true

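For the log samples above, one way is to add include_lines to your filestream input so that only lines containing INFO or ERROR are shipped. include_lines matches each line against a list of regular expressions, so it works even though the lines have different patterns. A minimal sketch based on your config (untested; it assumes the level strings INFO and ERROR appear verbatim in every line you want to keep):

filebeat.inputs:
- type: filestream
  id: my-filestream-id
  enabled: true
  paths:
    - /home/ubuntu/logs/scheduler/**/*.log
  # Ship only lines containing an INFO or ERROR level token; everything
  # else (e.g. "Stale pidfile exists - Removing it.") is dropped at the edge.
  include_lines: ['INFO', 'ERROR']

If you later combine this with exclude_lines, note that Filebeat applies include_lines first and then exclude_lines.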