Grok pattern fails in pipeline, but works in grok builder?

Here's a multiline log entry I'm using to test:

May 10, 2019 10:14:48 PM com.company.servers.Server <clinit>
INFO: Running vpc ABC, in environment DEV, region is USEAST1, host is https://app.abc.company.com

This is the pattern that matches fine in Grok Constructor:

%{CATALINA_DATESTAMP:jetty_timestamp}%{SPACE}%{JAVACLASS:java_class}%{SPACE}%{JAVAMETHOD:java_method}\n%{LOGLEVEL:log_level}:%{JAVALOGMESSAGE:log_message}

(Don't forget to put (?m) in the Logstash multiline filter edit box.)

Here's the ES pipeline:

PUT _ingest/pipeline/filebeat-6.7.2-jetty-log-pipeline
{
  "description" : "Ingest pipeline for jetty stderror",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["(?m)%{CATALINA_DATESTAMP:jetty_timestamp}%{SPACE}%{JAVACLASS:java_class}%{SPACE}%{JAVAMETHOD:java_method}\n%{LOGLEVEL:log_level}:%{JAVALOGMESSAGE:log_message}"],
        "ignore_missing": false
      }
    },
    {
      "remove":{
        "field": "message"
      }
    },
    {
      "date": {
        "field": "jetty_timestamp",
        "target_field": "@timestamp",
        "formats": ["MMM dd, yyyy HH:mm:ss a", "EE MMM dd HH:mm:ss z yyyy" ]
      }
    },
    {
      "remove": {
        "field" : "jetty_timestamp"
      }
    }
  ],
  "on_failure" : [
    {
      "set" : {
        "field" : "error.message",
        "value" : "{{ _ingest.on_failure_message }}"
      }
    },
    {
      "set" : {
        "field" : "_index",
        "value" : "failed-{{ _index }}"
      }
    }
  ]
}

Here's the test:

POST _ingest/pipeline/filebeat-6.7.2-jetty-log-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message":"""
May 10, 2019 10:14:48 PM com.company.servers.Server <clinit>
INFO: Running vpc ABC, in environment DEV, region is USEAST1, host is https://app.abc.company.com
"""
      }
    }
  ]
}

Here's the result:

{
  "doc" : {
    "_index" : "failed-_index",
    "_type" : "_type",
    "_id" : "_id",
    "_source" : {
      "message" : "May 10, 2019 10:14:48 PM com.company.servers.Server <clinit>\nINFO: Running vpc ABC, in environment DEV, region is USEAST1, host is https://app.abc.company.com",
      "error" : {
        "message" : "Provided Grok expressions do not match field value: [May 10, 2019 10:14:48 PM com.company.servers.Server <clinit>\\nINFO: Running vpc ABC, in environment DEV, region is USEAST1, host is https://app.abc.company.com]"
      }
    },
    "_ingest" : {
      "timestamp" : "2019-06-20T19:27:39.125Z"
    }
  }
}

This same grok pattern works fine for all of my other log entries, none of which use '<clinit>' as the Java method, so I'm thinking there may be a special-character issue here with the greater-than and less-than signs?

The grok pattern JAVAMETHOD defined here should consume it:

#Allow special <init>, <clinit> methods
JAVAMETHOD (?:(<(?:cl)?init>)|[a-zA-Z$_][a-zA-Z$_0-9]*)
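As a sanity check on the regex itself, here is a minimal Python sketch that runs the JAVAMETHOD definition quoted above against a few method names. It rests on assumptions: Python's `re` module stands in for the regex engine that Elasticsearch's grok processor uses, and `matches_whole` is a hypothetical helper name chosen for illustration. It only shows that the quoted definition accepts `<clinit>`. It says nothing about which JAVAMETHOD definition a given Elasticsearch version actually ships.

```python
import re

# JAVAMETHOD exactly as quoted above, transcribed into Python regex syntax.
# Assumption: Python's `re` behaves like the grok engine for this simple pattern.
JAVAMETHOD = re.compile(r"(?:(<(?:cl)?init>)|[a-zA-Z$_][a-zA-Z$_0-9]*)")


def matches_whole(candidate: str) -> bool:
    """Return True if the entire candidate string matches JAVAMETHOD."""
    return JAVAMETHOD.fullmatch(candidate) is not None


# Try the special initializer names, an ordinary method name,
# and an angle-bracketed name that the pattern should reject.
for name in ("<clinit>", "<init>", "main", "<other>"):
    print(name, matches_whole(name))
```

If the quoted definition accepts `<clinit>` here but the pipeline still fails, the definition bundled with the Elasticsearch node may differ from the one quoted, which would be consistent with %{DATA} working as a replacement.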

Any idea why this isn't working? I shouldn't have to escape anything within those """ blocks, right?

If I replace "%{JAVAMETHOD}" with "%{DATA}" it works, but I shouldn't need to do this?
