Here's a multiline log entry I'm using to test:
May 10, 2019 10:14:48 PM com.company.servers.Server <clinit>
INFO: Running vpc ABC, in environment DEV, region is USEAST1, host is https://app.abc.company.com
This is the pattern that matches it fine in Grok Constructor:
%{CATALINA_DATESTAMP:jetty_timestamp}%{SPACE}%{JAVACLASS:java_class}%{SPACE}%{JAVAMETHOD:java_method}\n%{LOGLEVEL:log_level}:%{JAVALOGMESSAGE:log_message}
(Don't forget to use (?m) in the Logstash multiline filter edit box.)
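To show how the pieces of that pattern line up with the two-line entry, here is a rough Python sketch. The regexes in it are simplified stand-ins I wrote for illustration, not the definitions that ship with Grok, and Python's re module is a different engine from the one Grok uses, so treat it as an approximation only.

```python
import re

# Simplified stand-ins for the Grok patterns named above. These are
# assumptions for illustration; the shipped definitions are longer.
CATALINA_DATESTAMP = r"[A-Z][a-z]{2} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} (?:AM|PM)"
JAVA_IDENT = r"[a-zA-Z$_][a-zA-Z$_0-9]*"
JAVACLASS = r"(?:" + JAVA_IDENT + r"\.)*" + JAVA_IDENT
JAVAMETHOD = r"(?:<(?:cl)?init>|" + JAVA_IDENT + r")"
LOGLEVEL = r"(?:TRACE|DEBUG|INFO|WARN(?:ING)?|ERROR|SEVERE|FATAL)"

# Same shape as the Grok expression: timestamp, class, method, a literal
# newline, then "LEVEL:" followed by the rest of the second line.
ENTRY = re.compile(
    r"(?P<jetty_timestamp>" + CATALINA_DATESTAMP + r")\s*"
    r"(?P<java_class>" + JAVACLASS + r")\s*"
    r"(?P<java_method>" + JAVAMETHOD + r")\n"
    r"(?P<log_level>" + LOGLEVEL + r"):"
    r"(?P<log_message>.*)"
)

sample = (
    "May 10, 2019 10:14:48 PM com.company.servers.Server <clinit>\n"
    "INFO: Running vpc ABC, in environment DEV, region is USEAST1, "
    "host is https://app.abc.company.com"
)

match = ENTRY.match(sample)
fields = match.groupdict() if match else {}

for name, value in fields.items():
    print(f"{name}: {value!r}")
```

Each named group corresponds to one of the field names in the Grok pattern, and the literal \n between the method and the log level is what ties the two lines together.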
Here's the ES pipeline:
PUT _ingest/pipeline/filebeat-6.7.2-jetty-log-pipeline
{
  "description" : "Ingest pipeline for jetty stderror",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["(?m)%{CATALINA_DATESTAMP:jetty_timestamp}%{SPACE}%{JAVACLASS:java_class}%{SPACE}%{JAVAMETHOD:java_method}\n%{LOGLEVEL:log_level}:%{JAVALOGMESSAGE:log_message}"],
        "ignore_missing": false
      }
    },
    {
      "remove": {
        "field": "message"
      }
    },
    {
      "date": {
        "field": "jetty_timestamp",
        "target_field": "@timestamp",
        "formats": ["MMM dd, yyyy HH:mm:ss a", "EE MMM dd HH:mm:ss z yyyy"]
      }
    },
    {
      "remove": {
        "field" : "jetty_timestamp"
      }
    }
  ],
  "on_failure" : [{
    "set" : {
      "field" : "error.message",
      "value" : "{{ _ingest.on_failure_message }}"
    }
  },
  {
    "set": {
      "field" : "_index",
      "value" : "failed-{{ _index }}"
    }
  }]
}
Here's the test:
POST _ingest/pipeline/filebeat-6.7.2-jetty-log-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message":"""
May 10, 2019 10:14:48 PM com.company.servers.Server <clinit>
INFO: Running vpc ABC, in environment DEV, region is USEAST1, host is https://app.abc.company.com
"""
      }
    }
  ]
}
Here's the result:
{
  "doc" : {
    "_index" : "failed-_index",
    "_type" : "_type",
    "_id" : "_id",
    "_source" : {
      "message" : "May 10, 2019 10:14:48 PM com.company.servers.Server <clinit>\nINFO: Running vpc ABC, in environment DEV, region is USEAST1, host is https://app.abc.company.com",
      "error" : {
        "message" : "Provided Grok expressions do not match field value: [May 10, 2019 10:14:48 PM com.company.servers.Server <clinit>\\nINFO: Running vpc ABC, in environment DEV, region is USEAST1, host is https://app.abc.company.com]"
      }
    },
    "_ingest" : {
      "timestamp" : "2019-06-20T19:27:39.125Z"
    }
  }
}
This same grok pattern works fine for all of my other log entries that don't use '<clinit>' as the Java method, so I suspect a special-character issue with the greater-than and less-than signs.
The grok pattern JAVAMETHOD defined here should consume it:
#Allow special <init>, <clinit> methods
JAVAMETHOD (?:(<(?:cl)?init>)|[a-zA-Z$_][a-zA-Z$_0-9]*)
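As a quick check of that regex on its own, the sketch below runs the definition quoted above through Python's re module. It assumes Python's syntax is close enough to Grok's regex engine for this particular expression, and the test strings other than <clinit> are ones I made up.

```python
import re

# JAVAMETHOD exactly as quoted above, compiled with Python's re module
# (an assumption: Grok itself runs on a different regex engine).
JAVAMETHOD = re.compile(r"(?:(<(?:cl)?init>)|[a-zA-Z$_][a-zA-Z$_0-9]*)")

# "<clinit>" comes from the log entry; the other candidates are made up.
candidates = ["<clinit>", "<init>", "main", "<other>"]

# fullmatch() requires the whole string to be a method name.
results = {name: JAVAMETHOD.fullmatch(name) is not None for name in candidates}

for name, matched in results.items():
    print(f"{name!r}: {'match' if matched else 'no match'}")
```

If the quoted definition is the one actually in effect, the angle brackets themselves should not need any escaping, since they are ordinary literal characters in this expression.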
Any idea why this isn't working? I shouldn't have to escape anything within those """ blocks, right?
If I replace "%{JAVAMETHOD}" with "%{DATA}" it works, but I shouldn't need to do that, should I?