Robust filter to categorise all logs into "log" and "message"

Hello there!

I would like to upgrade our ELK pipeline from 5.6.6 to 6.6.2.
I had some problems with ES and Kibana, which have since been solved, and I have finally got to Logstash and Filebeat.
In v5.6.6 we used quite simple filters which served us well and caught most of the logs we needed, but in 6.6.2 they don't work at all.
Events from Logstash arrive in ES and are searchable in Kibana, but the "log" and "message" fields are not present and events are tagged with _jsonparsefailure. I haven't found any mention in the documentation of a deprecated filter, or of anything else in our current config that is unsupported in 6.6.2.
Basically, I would like to write into two main fields, "log" and "message", by filtering the Filebeat output with as few filters as possible. The filters have to catch all of our logs: system logs (Mongo, ES, nginx, Kafka, etc.) as well as application logs.

What do I need to change in the current config to get it working?
I am not interested in a custom filter for each kind of log/application. Is it possible to configure a single filter that is as robust as possible?
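To illustrate, this is roughly the kind of single, defensive filter I have in mind — an untested sketch that assumes the raw container line arrives in a "log" field (as the Docker json-file driver writes it):

filter {
  # Parse only lines that actually look like JSON;
  # keep everything else verbatim in "message".
  if [log] =~ /^\s*\{/ {
    json {
      source => "log"
      skip_on_invalid_json => true
    }
  } else if [log] {
    mutate { rename => { "log" => "message" } }
  }
}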

The environment is running in a Kubernetes cluster.

Current filebeat.yml config:

prospectors:
        -
          paths:
            - "/var/log/containers/*.log"
          exclude_files:
            - '[a-zA-Z0-9\.\-]*_kube-system_*'
            - '[a-zA-Z0-9\.\-]*-logstash-*'
            - '[a-zA-Z0-9\.\-]*-filebeat-*'
          symlinks: true
          input_type: log
          close_older: 5m
          force_close_files: true

        -
          paths:
            - "/var/log/containers/*.log"
          input_type: log
          symlinks: true
          close_older: 5m
          force_close_files: true
          multiline:
            # Any line not starting with "[20" (as in [2016-01-06 ...) is part of multiline
            pattern: '^\[20'
            negate: true
            match: after

    output:
      logstash:
        hosts: ["monitor-logstash:5044"]

    shipper:

    logging:
      level: info
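(For comparison, my understanding of the breaking changes is that in 6.x the same settings would be spelled roughly as below — prospectors became filebeat.inputs, input_type became type, close_older became close_inactive, and force_close_files was replaced by close_removed/close_renamed; I've also merged the two prospectors into one input here. An untested sketch:)

filebeat.inputs:
  - type: log
    paths:
      - "/var/log/containers/*.log"
    exclude_files:
      - '[a-zA-Z0-9\.\-]*_kube-system_*'
      - '[a-zA-Z0-9\.\-]*-logstash-*'
      - '[a-zA-Z0-9\.\-]*-filebeat-*'
    symlinks: true
    close_inactive: 5m
    close_removed: true
    multiline:
      # Any line not starting with "[20" (as in [2016-01-06 ...) is part of a multiline event
      pattern: '^\[20'
      negate: true
      match: after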

logstash.conf:

input {
  beats {
    codec => "json"
    port => 5044
  }
}

filter {

  date {
    match => ["time", "ISO8601"]
    remove_field => "time"
  }

  grok {
    match => { "source" => "/var/log/containers/%{DATA:pod_name}_%{DATA:namespace}_%{GREEDYDATA:container_name}-%{DATA:container_id}.log" }
    remove_field => ["source"]
  }

  if [message] {
    json {
      source => ["message"]
      remove_field => "message"
    }
  }

  if [log] {
    json {
      source => ["log"]
      remove_field => "log"
    }
  }
}

output {
  elasticsearch {
    hosts => ["monitor-es:9200"]
    manage_template => false
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
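(One suspicion about the failures above: with codec => "json" on the beats input, any event whose payload isn't valid JSON fails to decode right at the input and gets tagged _jsonparsefailure. Dropping the codec and decoding selectively in the filter, as in the sketch earlier, might sidestep that — a guess, not verified:)

input {
  beats {
    # no json codec here; decode per event in the filter stage instead
    port => 5044
  }
}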

Thanks!

Update:
I managed to get all of the logs into the message field, at least.
The problem with collection was in the Filebeat config. I changed it to use the docker input instead of collecting from files (and moved the multiline settings onto the input itself, where I believe they belong):

  filebeat.yml: |-
    filebeat.config:
      inputs:
        # Mounted `filebeat-inputs` configmap:
        path: ${path.config}/inputs.d/*.yml
        # Reload inputs configs as they change:
        reload.enabled: false
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false

    # To enable hints based autodiscover, remove `filebeat.config.inputs` configuration and uncomment this:
    #filebeat.autodiscover:
    #  providers:
    #    - type: kubernetes
    #      hints.enabled: true

    output:
      logstash:
        hosts: ["monitor-logstash:5044"]

and inputs config as a configmap mounted into the pod:

  kubernetes.yml: |-
    - type: docker
      containers.ids:
      - "*"
      # Any line not starting with "[20" (as in [2019-04-15 ...) is part of a multiline event
      multiline.pattern: '^\[20'
      multiline.negate: true
      multiline.match: after
      processors:
        - add_kubernetes_metadata:
            in_cluster: true
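As far as I can tell, the docker input decodes the json-file lines itself and puts the container output into message, which would explain why everything now ends up in that field.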
