Issue with Logstash Grok and special characters in a field

Hello,

I have an issue and I don't understand why it happens...

Here is the example:

Log line to parse:
2017-12-21 20:26:29.253;TEST;5236;10792;General;Information;-1; METHODE=EnvoyerDoc.Mlbx_SendMailSmtpEx RETOUR=0 MEDIA= STATUS=>OK< DEST=10|529758|1|µL_DocumentTypeCRIµ;EXPERT SYST;099999999-GP1L;S00000001;VIA010000;20305;;TOTO + COM|ft-ct@mail.com|EXPERTSYST|

Grok pattern currently used:
%{LOGDATEFORMAT:LogDate}%{SEMICOLON_DELIMITER}%{GREEDYDATA:Machine}%{SEMICOLON_DELIMITER}%{GREEDYDATA:ProcessID}%{SEMICOLON_DELIMITER}%{GREEDYDATA:Win32Thread}%{SEMICOLON_DELIMITER}%{GREEDYDATA:LogCategory}%{SEMICOLON_DELIMITER}%{GREEDYDATA:LogSeverity}%{SEMICOLON_DELIMITER}%{NUMBER:Priority}%{SEMICOLON_DELIMITER}%{GREEDYDATA:LogMessage}

With the following two custom patterns:
SEMICOLON_DELIMITER ;
LOGDATEFORMAT %{YEAR}[./-]%{MONTHNUM}[./-]%{MONTHDAY} %{TIME}

The log line is not parsed correctly by the Grok pattern; here is the result:
{
  "LogDate": [
    [
      "2017-12-21 20:26:29.253"
    ]
  ],
  "Machine": [
    [
      "TEST;5236;10792;General;Information;-1; METHODE=EnvoyerDoc.Mlbx_SendMailSmtpEx RETOUR=0 MEDIA= STATUS=>OK< DEST=10|529758|1|µL_DocumentTypeCRIµ"
    ]
  ],
  "ProcessID": [
    [
      "EXPERT SYST"
    ]
  ],
  "Win32Thread": [
    [
      "099999999-GP1L"
    ]
  ],
  "LogCategory": [
    [
      "S00000001"
    ]
  ],
  "LogSeverity": [
    [
      "VIA010000"
    ]
  ],
  "Priority": [
    [
      "20305"
    ]
  ],
  "LogMessage": [
    [
      ";TOTO + COM|ft-ct@mail.com|EXPERTSYST| "
    ]
  ]
}

The expected result for the LogMessage field is the following:
LogMessage => "METHODE=EnvoyerDoc.Mlbx_SendMailSmtpEx RETOUR=0 MEDIA= STATUS=>OK< DEST=10|529758|1|µL_DocumentTypeCRIµ;EXPERT SYST;099999999-GP1L;S00000001;VIA010000;20305;;TOTO + COM|ft-ct@mail.com|EXPERTSYST|"

Just for information, only some lines in my log look like the one above, and this Grok pattern works fine with the other lines!

Does anyone have an explanation for this?

Thank you in advance for your help!

Stephane

Well, there is absolutely nothing wrong with the grok pattern that I can identify.
Just a thought: have you tried using the csv filter? You can use a semicolon as the delimiter in it.
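Just a rough sketch of what I mean (the column names are taken from your grok pattern, so adjust as needed):

filter {
  csv {
    # the log uses ";" as the field separator
    separator => ";"
    # column names borrowed from the grok pattern above
    columns => ["LogDate", "Machine", "ProcessID", "Win32Thread", "LogCategory", "LogSeverity", "Priority", "LogMessage"]
  }
}

Note that any extra semicolons inside the message itself would still spill into overflow columns (column9, column10, ...), so you might have to stitch those back together afterwards.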
Regards,
N

Don't use more than one DATA or GREEDYDATA pattern in a single grok expression. It's a serious performance killer and is probably the source of your problems. Consider using a csv or dissect filter to split the payload into fields.
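For example, a dissect filter along these lines (assuming the raw line is in the message field; field names borrowed from your grok pattern) would keep everything after the seventh semicolon in LogMessage, because the last key in a dissect mapping takes the rest of the line:

filter {
  dissect {
    mapping => {
      # the final %{LogMessage} key captures everything after the 7th ";",
      # including any further ";" and "|" characters
      "message" => "%{LogDate};%{Machine};%{ProcessID};%{Win32Thread};%{LogCategory};%{LogSeverity};%{Priority};%{LogMessage}"
    }
  }
}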

Hello

Thank you for your help.
Some of the fields can be parsed as WORD instead of GREEDYDATA.
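For example, something roughly like this (which fields end up as WORD or NUMBER depends on your data; this sketch is just based on the sample line above), keeping a single GREEDYDATA only at the very end:

%{LOGDATEFORMAT:LogDate}%{SEMICOLON_DELIMITER}%{WORD:Machine}%{SEMICOLON_DELIMITER}%{NUMBER:ProcessID}%{SEMICOLON_DELIMITER}%{NUMBER:Win32Thread}%{SEMICOLON_DELIMITER}%{WORD:LogCategory}%{SEMICOLON_DELIMITER}%{WORD:LogSeverity}%{SEMICOLON_DELIMITER}%{NUMBER:Priority}%{SEMICOLON_DELIMITER}%{GREEDYDATA:LogMessage}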

With this workaround everything is fine for me.

All log lines (both the nominal and the special ones) are now indexed correctly in my Elasticsearch database.

I'm marking this topic as resolved.

Thanks for your help!!!
And see you soon maybe!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.