Filebeat configuration in ECE on-premises cluster does not grok the message field

Hello,

The ECE installation comes with a default logging-and-metrics cluster which is filled with data by filebeat, metricbeat and allocator-metricbeat.

I see in Kibana that the message field is not being grokked for various components, and as a result the wrong timestamp is being used.

filebeat.yml is located under:
/services/beats-runner/managed/filebeat/filebeat.yml

When I open this file I see the following:

# Auto-generated by service.jar on container start 
# Source /scala-services/beats-runner/src/main/resources/services/filebeat/template

I wouldn't mind creating the necessary grok patterns myself, but I need a way to edit the template that is used.

Could you please explain how this can be achieved?

Thanks.

Unfortunately it's not possible to edit the filebeat template

I think the only thing you can do is to create a separate index with an ingest pipeline and then periodically reindex the raw data into that :frowning:
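
For anyone following along, the reindex approach would look roughly like this. The index and pipeline names here are placeholders, not the real names from the logging-and-metrics cluster, and you would run it periodically (the raw data keeps arriving in the source index):

```
POST _reindex
{
  "source": {
    "index": "service-logs-raw"
  },
  "dest": {
    "index": "service-logs-parsed",
    "pipeline": "parse-service-logs"
  }
}
```

The `dest.pipeline` parameter runs every reindexed document through the named ingest pipeline before it is written to the destination index.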

I'll have a dig around and try to figure out why we haven't fixed this (I'm sure I remember it being discussed ages ago)

Thanks for the feedback, it's much appreciated

Hello Alex,

Can't I set up the pipeline on the same index?

Is the index template fixed and strict?
I don't really care about the older data at this point.

We don't recommend making changes to the system data flows because of the risk of breaking downstream dependencies, upgrades etc.

(The filebeat and index templates are embedded in the docker image. The templates and mappings that are loaded into the cluster can of course be edited, but may be overwritten during cluster or ECE upgrades)

I understand.

For now I have altered the index template service-logs-6.3.2 and added "default_pipeline": "ece-logs".
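
For reference, wiring a default pipeline into an index template looks roughly like this. The `index_patterns` value below is illustrative only; in practice you would GET the existing template first and re-PUT its full body with the setting added, so the existing mappings and settings are not lost:

```
PUT _template/service-logs-6.3.2
{
  "index_patterns": ["service-logs-*"],
  "settings": {
    "index.default_pipeline": "ece-logs"
  }
}
```

Note that the setting only takes effect on indices created after the template change; existing indices are unaffected unless you set `index.default_pipeline` on them directly.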

The pipeline looks as follows:

PUT _ingest/pipeline/ece-logs
{
  "description": "parse ECE logs",
  "processors": [
    {
      "set": {
        "field": "original_message",
        "value": "{{message}}"
      }
    },
    {
      "grok": {
        "field": "message",
        "patterns": [
          "\\[%{TIMESTAMP_ISO8601:timestamp}\\]\\[%{LOGLEVEL:level}\\s*\\]\\[%{DATA:logger}\\]?\\s*%{GREEDYDATA:message}",
          "%{TIMESTAMP_ISO8601:timestamp}?\\s*%{LOGLEVEL:level}?\\s*\\[%{DATA:logger}\\] ?\\s*%{GREEDYDATA:message}",
          "{\\\"level\\\":\\\"%{LOGLEVEL:level}\\\",\\\"log\\\":\\\"%{DATA:logger}\\\",\\\"@timestamp\\\":\\\"%{TIMESTAMP_ISO8601:timestamp}\\\",\\\"message\\\":\\\"*%{GREEDYDATA:message}\\\"}"
        ]
      }
    },
    {
      "uppercase": {
        "field": "level"
      }
    },
    {
      "date": {
        "field": "timestamp",
        "formats": [
          "ISO8601",
          "yyyy-MM-dd HH:mm:ss,SSS",
          "yyyy-MM-dd'T'HH:mm:ss,SSS"
        ],
        "timezone": "Europe/Amsterdam"
      }
    },
    {
      "remove": {
        "field": [
          "timestamp",
          "original_message"
        ]
      }
    }
  ],
  "on_failure": [
    {
      "set": {
        "field": "error.message",
        "value": "{{ _ingest.on_failure_message }}"
      }
    }
  ]
}
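
A quick way to exercise a pipeline like this without touching real data is the simulate API. The sample message below is made up, but it follows the shape the first grok pattern expects:

```
POST _ingest/pipeline/ece-logs/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "[2018-10-01T12:00:00,123][INFO ][o.e.n.Node] starting up"
      }
    }
  ]
}
```

The response shows the document as it would look after all processors have run, which makes it easy to iterate on the grok patterns before pointing live traffic at the pipeline.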

Looks good so far :slight_smile:

If something goes wrong, original_message is quite helpful for debugging.
If all is OK, timestamp and original_message are removed.

I suppose we could PUT this pipeline and re-apply the index template change from cron or some other mechanism.
It's not super clean, but I'd rather have some documents go unparsed than have none parsed at all.

I will keep you posted on additions for the other -logs- indices.

Cheers!

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.