Grok pattern DATA vs GREEDYDATA mismatch

One log file contains two different log line formats, shown below. Why do %{GREEDYDATA:loglevel} and %{DATA:loglevel} produce such different loglevel output?

2020-03-26 11:31:10,324 [Thread-40]             INFO   o.e.j.s.AbstractConnector - Stopped ServerConnector@676505de{HTTP/1.1,[http/1.1]}{0.0.0.0:8780} 

%{DATESTAMP:timestamp} \[%{DATA:thread}\]  %{GREEDYDATA:loglevel}  %{JAVACLASS:javaClass} %{GREEDYDATA:logmessage}

{
  "javaClass": "o.e.j.s.AbstractConnector",
  "loglevel": "            INFO ",
  "logmessage": " - Stopped ServerConnector@676505de{HTTP/1.1,[http/1.1]}{0.0.0.0:8780}",
  "thread": "Thread-40",
  "timestamp": "20-03-26 11:31:10,324"
}

2020-03-26 03:36:21,546 [DispatcherScheduler_Worker-1] INFO   o.a.c.h.HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection timed out: connect 

%{DATESTAMP:timestamp} \[%{DATA:thread}\] %{DATA:loglevel} %{JAVACLASS:javaClass} %{GREEDYDATA:logmessage}

{
  "javaClass": "o.a.c.h.HttpMethodDirector",
  "loglevel": "INFO ",
  "logmessage": " - I/O exception (java.net.ConnectException) caught when processing request: Connection timed out: connect",
  "thread": "DispatcherScheduler_Worker-1",
  "timestamp": "20-03-26 03:36:21,546"
}

2020-03-26 11:31:10,324 [Thread-40]             INFO   o.e.j.s.AbstractConnector - Stopped ServerConnector@676505de{HTTP/1.1,[http/1.1]}{0.0.0.0:8780}

%{DATESTAMP:timestamp} \[%{DATA:thread}\] %{DATA:loglevel}  %{JAVACLASS:javaClass}%{GREEDYDATA:logmessage}

{
  "javaClass": "INFO",
  "loglevel": "          ",
  "logmessage": "   o.e.j.s.AbstractConnector - Stopped ServerConnector@676505de{HTTP/1.1,[http/1.1]}{0.0.0.0:8780}",
  "thread": "Thread-40",
  "timestamp": "20-03-26 11:31:10,324"
}

Because of this, the parsed data in Kibana sometimes shows the correct LogLevel and sometimes does not:

Timestamp: Mar 26, 2020 @ 15:13:42.950	
JavaClass: VER.jks	
LogLevel: ER	
Message: 2020-03-26 11:31:39,799 [WrapperSimpleAppMain]  INFO   o.e.j.u.s.SslContextFactory - x509=X509@706a432c(rmmca,h=[],w=[]) for SslContextFactory@7f4f185d(file:///C:/Program%20Files%20(x86)/ESQ%20SST/Certificates_ESQ/SERVER.jks,null)

Timestamp: Mar 26, 2020 @ 15:13:42.950
JavaClass: o.e.j.s.AbstractConnector	
LogLevel: INFO	
Message: 2020-03-26 11:31:39,821 [WrapperSimpleAppMain]  INFO   o.e.j.s.AbstractConnector - Started ServerConnector@665c31ea{SSL,[ssl, http/1.1]}{0.0.0.0:8782}

I suggest you write two patterns, separated by | (pipe), one for each log format.
Handling both formats with GREEDYDATA in a single grok pattern gets complicated.
Before putting the filter into production, test the patterns against your log lines at
https://grokconstructor.appspot.com/do/match first.
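
As background on why the two macros behave so differently: in the stock grok pattern set, DATA is defined as `.*?` (lazy, takes as little as possible) and GREEDYDATA as `.*` (greedy, takes as much as possible). The Python sketch below emulates this with plain regular expressions; it is not Logstash itself. The JAVACLASS stand-in is a simplified assumption, and the sample line is shortened, with 13 spaces of padding chosen purely for illustration.

```python
import re

# Regex bodies of the grok macros. DATA and GREEDYDATA follow the stock grok
# definitions; JAVACLASS is a simplified stand-in (assumption).
DATA = r".*?"
GREEDYDATA = r".*"
JAVACLASS = r"(?:[A-Za-z_$][\w$]*\.)*[A-Za-z_$][\w$]*"

# Shortened sample line; the 13 spaces of padding are chosen for illustration.
LINE = "[Thread-40]" + " " * 13 + "INFO" + " " * 3 + "o.e.j.s.AbstractConnector - Stopped"

def parse(level_macro):
    """Match LINE with the given regex body used for the loglevel field."""
    pattern = (
        r"\[(?P<thread>" + DATA + r")\]  "
        r"(?P<loglevel>" + level_macro + r")  "
        r"(?P<javaClass>" + JAVACLASS + r") "
        r"(?P<logmessage>" + GREEDYDATA + r")"
    )
    return re.search(pattern, LINE).groupdict()

lazy = parse(DATA)
greedy = parse(GREEDYDATA)
print("lazy  :", repr(lazy["loglevel"]), repr(lazy["javaClass"]))
print("greedy:", repr(greedy["loglevel"]), repr(greedy["javaClass"]))
```

With the lazy body, the loglevel capture stops inside the padding, so INFO lands in javaClass. With the greedy body, the level is captured, but together with its surrounding padding. In grok itself, matching the padding explicitly with \s+ and the level with %{LOGLEVEL:loglevel} should avoid both outcomes, though that has not been tested against the full log file here.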

Regards,
Fadjar Tandabawana

Thanks. When I looked into it, I saw that it was used only for log formats that contain pipes in the log lines, like this:

2020-03-30 06:15:23.773	IncidentAgent	5980	Information	Processing next batch of scheduled incident activity records	

But I don't have this. Is there a way to just slice out the information I want?

The two formats are very different. You need to build several grok matches: use a conditional to check which format a line has, then apply the grok pattern specific to that format.

The second format has no "[" character, so you need to build a separate match for it.
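
The advice above, one pattern per format tried in turn, can be sketched in Python as follows. This only emulates the idea with plain regular expressions; in Logstash the equivalent would be several patterns in one grok filter, or separate grok filters inside conditionals. The regexes, the field names source and pid, and the assumption that the second format is tab-separated are illustrative guesses, not taken from the actual configuration.

```python
import re

# Format 1: "2020-03-26 11:31:10,324 [Thread-40]   INFO   o.e.j.s.Foo - message"
BRACKETED = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"\[(?P<thread>[^\]]*)\]\s+"
    r"(?P<loglevel>[A-Z]+)\s+"
    r"(?P<javaClass>[\w.$]+)\s+-\s+"
    r"(?P<logmessage>.*)$"
)

# Format 2 (assumed tab-separated): timestamp, source, pid, level, message.
# "source" and "pid" are hypothetical field names.
TABBED = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\t"
    r"(?P<source>[^\t]+)\t(?P<pid>\d+)\t(?P<loglevel>[^\t]+)\t"
    r"(?P<logmessage>.*)$"
)

PATTERNS = [("bracketed", BRACKETED), ("tabbed", TABBED)]

def parse(line):
    """Return (format_name, fields) for the first pattern that matches."""
    for name, pattern in PATTERNS:
        match = pattern.match(line.rstrip())
        if match:
            return name, match.groupdict()
    return None, {}  # comparable to a _grokparsefailure tag in Logstash

samples = [
    "2020-03-26 11:31:10,324 [Thread-40]             INFO   o.e.j.s.AbstractConnector - Stopped",
    "2020-03-30 06:15:23.773\tIncidentAgent\t5980\tInformation\tProcessing next batch",
]
for sample in samples:
    name, fields = parse(sample)
    print(name, fields.get("loglevel"), fields.get("logmessage"))
```

Anchoring each pattern at the start of the line and consuming the padding with \s+ keeps the level and class captures from drifting into the message text, which is one plausible way fragments such as "ER" and "VER.jks" could end up in the wrong fields.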
