A few messages go unparsed from Logstash into Elasticsearch, without any failure

I have deployed a Filebeat -> Logstash -> Elasticsearch -> Kibana stack for log monitoring in my testing environment. I have two GROK patterns specified in the Logstash configuration. The problem is that a few of the messages go unparsed (the GROK filter is not applied) from Logstash to Elasticsearch, and without any parse-failure exception. Also, the Filebeat fields like beat.name, beat.hostname, and source are missing from the final message in Elasticsearch.

The same GROK patterns did not give any problem when I was using the ELK stack without Filebeat, so I am not sure whether this has something to do with Filebeat or with Logstash.

Note: the unparsed messages are few in comparison to the total number of messages (roughly 1K out of 1 million log lines).

That means they are somehow successful. Without seeing your configs and the data, it will be almost impossible to tell you what is going on.

What I like to do on every grok statement (or group of groks) is add a tag, so that I can see which filters a message actually hit.

add_tag => [ "rule-%{type}-999" ] — this way I know where it was. I don't get a lot of tags, because I wrap each type of grok in a conditional like if [type] =~ /apache/; this way only my Apache groks are used.
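As a sketch, the tagging approach described above looks roughly like this in a Logstash filter block (the pattern, the /apache/ type, and the rule number 999 are placeholders, not taken from the thread's actual config):

```conf
filter {
  # Only run the Apache groks against Apache events.
  if [type] =~ /apache/ {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
      # Tag the event so you can later see which grok rule it hit;
      # "999" here stands in for your own rule-numbering scheme.
      add_tag => [ "rule-%{type}-999" ]
    }
  }
}
```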

If you can post some of your code and data, we could try to figure it out.

You could also go to https://grokdebug.herokuapp.com/ to help you determine why the grok is not parsing your data.

Thanks for the info.

Here are my two GROK patterns (in the order specified in the Logstash config file), along with an input that goes unparsed:

%{WEBLOGIC_TIMESTAMP:logTimestamp} activityId:%{DATA:activityId}, parentActivityId:%{DATA:parentActivityId}, processId:%{DATA:processId}, userId:%{DATA:userId} %{DATA:processInfo} %{LOGLEVEL:log-level} %{DATA:aop}(?<ControllerName>([a-zA-Z]+Controller))(?<CGLIB>(\$\$EnhancerBySpringCGLIB\$\$[a-z0-9]{7,8}))(\.)?(?<MethodName>([a-zA-Z0-9]*))(?<ObjectName>(\(.*\)))(?<ElapseStart>(.*Elapsed time\s*))(?<ElapsedTimeMs>(\d+*))(?<TimeUnit>(\s*milliseconds.))%{GREEDYDATA:message}

%{WEBLOGIC_TIMESTAMP:logTimestamp} activityId:%{DATA:activityId}, parentActivityId:%{DATA:parentActivityId}, processId:%{DATA:processId}, userId:%{DATA:userId} %{DATA:processInfo} %{LOGLEVEL:log-level} %{GREEDYDATA:message}

WEBLOGIC_TIMESTAMP here is a custom pattern:

WEBLOGIC_TIMESTAMP (?:0?[1-9]|1[0-2])[/-](?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])[/-](?:\d\d){1,2} (?!<[0-9])(?:2[0123]|[01]?[0-9]):(?:[0-5][0-9])(?::(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?))(?![0-9])

Input log message:

03/31/2017 10:40:10,275 activityId:8498a737-145e-b9ae-dd4b-36fef0522c59, parentActivityId:, processId:23205@RCOVLNX3085, userId:ftsuser [[ACTIVE] ExecuteThread: '7' for queue: 'weblogic.kernel.Default (self-tuning)'] INFO com.fti.di.dashboardservice.aspect.LoggingAspect  - method: EventQueryController$$EnhancerBySpringCGLIB$$7c0ebc09.getNotifications() finished. Elapsed time 98 milliseconds.
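As a quick sanity check (a sketch only — this uses Python's re module as a stand-in for grok's Oniguruma engine, so it is an approximation rather than a definitive grok test), the custom WEBLOGIC_TIMESTAMP pattern does match the start of the sample log line:

```python
import re

# The WEBLOGIC_TIMESTAMP pattern from above, split for readability.
WEBLOGIC_TIMESTAMP = (
    r"(?:0?[1-9]|1[0-2])[/-]"                                 # month
    r"(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])[/-]"       # day
    r"(?:\d\d){1,2} "                                         # year
    r"(?!<[0-9])(?:2[0123]|[01]?[0-9]):(?:[0-5][0-9])"        # HH:MM
    r"(?::(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?))(?![0-9])"   # :SS,ms
)

# Start of the sample input log message from the thread.
line = ("03/31/2017 10:40:10,275 activityId:8498a737-145e-b9ae-dd4b-36fef0522c59,"
        " parentActivityId:, processId:23205@RCOVLNX3085, userId:ftsuser")

m = re.match(WEBLOGIC_TIMESTAMP, line)
print(m.group(0))  # → 03/31/2017 10:40:10,275
```

So the timestamp portion itself is not what breaks the match.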

I have also checked my GROK patterns in the Grok Debugger link specified, and the message is parsed successfully by both GROK patterns.

Also attached are screenshots of the log message in Kibana and of the Logstash config file.

Neither regex works on that string.

The first regex comes closest to matching the string, but

parentActivityId:%{DATA:parentActivityId}, processId:%{DATA:processId},

does not match

parentActivityId:, processId:23205@RCOVLNX3085,

you need something like

parentActivityId:(|%{DATA:parentActivityId}), processId:%{DATA:processId},
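To illustrate the suggested fix (a sketch only — grok's DATA macro expands to the lazy regex .*?, and Python's re module is used here as an approximation of grok's Oniguruma engine), a pattern with an explicit empty alternative matches the line with the blank parentActivityId:

```python
import re

# grok's DATA expands to a lazy "match anything" regex.
DATA = r".*?"

# The suggested fix: an explicit empty alternative so a blank
# parentActivityId value still matches.
fixed = re.compile(
    rf"parentActivityId:(?:|(?P<parentActivityId>{DATA})),"
    rf" processId:(?P<processId>{DATA}),"
)

line = "parentActivityId:, processId:23205@RCOVLNX3085,"
m = fixed.search(line)
print(m.group("processId"))  # → 23205@RCOVLNX3085
```

Here the empty alternative consumes the blank value, and processId is still captured correctly.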

Please use the Grok Debugger and test out your regex; if you still have problems after getting it working, I can see if I can assist further.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.