Grok Pattern not matching in Logstash, but matches in Grok Debugger

Hi,
I have this grok pattern:

(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME})%{SPACE}%{LOGLEVEL:level}%{SPACE}%{NUMBER:pid} --- \[%{SPACE}(?<thread>[A-Za-z0-9-]+)\] [A-Za-z0-9.]*\.(?<class>[A-Za-z0-9#_]+)\s*:%{SPACE}(\{%{GREEDYDATA:fields.CorrelationId}\})?:(\{%{GREEDYDATA:fields.proTokenBalance}\}%{SPACE})?\s%{GREEDYDATA:logmessage}

I am matching this log line:

2019-08-13 11:25:39,454 DEBUG 9772 --- [ scheduling-1] c.p.r.s.TokenService : :{10248780001247} The PRO Token balance is 102487.80001247

When I start the ELK stack, it gives me _grokparsefailure, but the line is parsed correctly in the Grok Debugger tool.

Would someone help me with this issue?

With that pattern and that message I get

"fields.proTokenBalance" => "10248780001247",
            "@timestamp" => 2019-08-13T13:14:52.224Z,
                 "class" => "TokenService",
            "logmessage" => "The PRO Token balance is 102487.80001247",
                 "level" => "DEBUG",
                   "pid" => "9772",
                "thread" => "scheduling-1"

Are you sure there are single spaces around the ---?

Hi, thank you for your reply.

Yes, I am sure about the single spaces around the ---.

The logging was working fine until I added the extra field proTokenBalance.

Here's my output in the grok debugger console:

{
  "level": "INFO",
  "logmessage": "The PRO Token balance is 271.99996582",
  "pid": "5684",
  "thread": "scheduling-1",
  "fields": {
    "proTokenBalance": "271.99996582"
  },
  "class": "AccountBalanceCheckService",
  "timestamp": "2019-08-14 11:25:11,853"
}

I have a test setup with this Logstash config:

input {
  stdin {}
}
output { stdout { codec => rubydebug } }
filter {
  grok {
    match => {
      "message" => "(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME})%{SPACE}%{LOGLEVEL:level}%{SPACE}%{NUMBER:pid} --- \[%{SPACE}(?<thread>[A-Za-z0-9-]+)\] [A-Za-z0-9.]*\.(?<class>[A-Za-z0-9#_]+)\s*:%{SPACE}(\{%{GREEDYDATA:fields.CorrelationId}\})?:(\{%{GREEDYDATA:fields.proTokenBalance}\}%{SPACE})?\s%{GREEDYDATA:logmessage}"
    }
  }
}

This is how I execute the command:

echo '2019-08-13 11:25:39,454 DEBUG 9772 --- [ scheduling-1] c.p.r.s.TokenService : :{10248780001247} The PRO Token balance is 102487.80001247' | C:\logstash\bin\logstash.bat -f .\test.config

It outputs the correct result as well:

{
                "logmessage" => "The PRO Token balance is 102487.80001247\r",
                   "message" => "2019-08-13 11:25:39,454 DEBUG 9772 --- [ scheduling-1] c.p.r.s.TokenService : :{10248780001247} The PRO Token balance is 102487.80001247\r",
                    "thread" => "scheduling-1",
                       "pid" => "9772",
                 "timestamp" => "2019-08-13 11:25:39,454",
                     "class" => "TokenService",
                "@timestamp" => 2019-08-14T08:57:06.094Z,
                  "@version" => "1",
                      "host" => "Manol",
                     "level" => "DEBUG",
    "fields.proTokenBalance" => "10248780001247"
}

I also forwarded the output to Elasticsearch, and I saw the field in Kibana.

But when I read the log from a file, I use this configuration:

input {
  file {
    path => 'C:\projects\Propy.Ethereum.Rest\logs\test\log.log'
    start_position => "beginning"
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => "localhost:9200"
    user => "elastic"
    password => "changeme"
  }
}
filter {
  grok {
    match => {
      "message" => "(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME})%{SPACE}%{LOGLEVEL:level}%{SPACE}%{NUMBER:pid} --- \[%{SPACE}(?<thread>[A-Za-z0-9-]+)\] [A-Za-z0-9.]*\.(?<class>[A-Za-z0-9#_]+)\s*:%{SPACE}(\{%{GREEDYDATA:fields.CorrelationId}\})?:(\{%{GREEDYDATA:fields.proTokenBalance}\}%{SPACE})?\s%{GREEDYDATA:logmessage}"
    }
  }
}

And I get this result in Kibana:

{
  "_index": "logstash-2019.08.13-000001",
  "_type": "_doc",
  "_id": "j5Fij2wBxXzNBooJFSk2",
  "_version": 1,
  "_score": null,
  "_source": {
    "message": "2019-08-14 12:07:44,055 INFO 14732 --- [ scheduling-1] c.p.r.s.AccountBalanceCheckService : :{271.99996582} The PRO Token balance is 271.99996582\r",
    "@timestamp": "2019-08-14T09:07:45.471Z",
    "path": "/tmp/log.log",
    "host": "3330d93c90be",
    "@version": "1",
    "tags": [
      "_grokparsefailure"
    ]
  },
  "fields": {
    "@timestamp": [
      "2019-08-14T09:07:45.471Z"
    ]
  },
  "sort": [
    1565773665471
  ]
}

As you can see, it gives me _grokparsefailure on this line.

The only difference between the file and stdin is that the log file contains multiline entries, and the Logstash configuration doesn't handle multiline input. I am not sure whether this is somehow related.
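If it is, I could glue stack-trace lines onto the previous event with a multiline codec. This is just a sketch, assuming every new event starts with a timestamp like 2019-08-13 11:25:39,454 and continuation lines do not:

input {
  file {
    path => 'C:\projects\Propy.Ethereum.Rest\logs\test\log.log'
    start_position => "beginning"
    codec => multiline {
      # Lines that do NOT start with a timestamp belong to the previous event.
      pattern => "^%{TIMESTAMP_ISO8601}"
      negate => true
      what => "previous"
    }
  }
}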

What is your opinion?

I get an unmatched parenthesis error with that grok filter. Please edit your post, select the configuration and click on </>. Then verify that the configuration displayed in the preview panel actually matches your configuration.


This is a pastebin of the configuration:

https://pastebin.com/KVwKgMDQ

This is a pastebin of part of my log file:
https://pastebin.com/nuE23JFJ

Thank you for your help,
Regards,
Manol.

In 7.3.0 that gets me

{
             "timestamp" => "2019-08-14 16:28:23,667",
                 "class" => "AccountBalanceCheckService",
                 "level" => "INFO",
"fields.proTokenBalance" => "271.99996582",
            "logmessage" => "The PRO Token balance is 271.99996582",
               "message" => "2019-08-14 16:28:23,667 INFO  1848 --- [   scheduling-1] c.p.r.s.AccountBalanceCheckService         : :{271.99996582} The PRO Token balance is 271.99996582",
                   "pid" => "1848",
                "thread" => "scheduling-1"
}

I am running on UNIX, so I do not have that \r at the end of the line, but I would expect GREEDYDATA to consume that.
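If the \r ever does get in the way, you could strip it before the grok runs. A minimal sketch, assuming the raw line is in the default message field:

filter {
  mutate {
    # Remove a trailing carriage return left over from Windows line endings.
    gsub => [ "message", "\r$", "" ]
  }
}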

Not sure if it matters, but why do you have the order in your pipeline as:

  1. Input
  2. Output
  3. Filter

As for your patterns, they seem to match properly on the debugger.
https://grokdebug.herokuapp.com/

Well, this is confusing for me 🙂

I have tried everything I could think of. The configuration seems okay, but somehow it doesn't work.

We have Java stack traces in the log, so part of the log becomes multiline; this could be the issue (and the Logstash configuration doesn't handle multiline logs at the moment).

I will play a bit more with my local ELK installation.
Regards,
Manol.

Actually it doesn't. When I start Logstash, it finds the file path properly and it initializes the connection to Elasticsearch as well.
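One more thing I want to rule out is the file input's bookkeeping: as far as I understand, it remembers how far it has read in a sincedb file, so an already-read file is not re-read even with start_position => "beginning". A sketch of what I plan to try, assuming Windows ("NUL" plays the role of /dev/null there, and the docs recommend forward slashes in path):

input {
  file {
    # Forward slashes instead of backslashes, as the file input docs suggest.
    path => "C:/projects/Propy.Ethereum.Rest/logs/test/log.log"
    start_position => "beginning"
    # Do not persist the read position, so the whole file is re-read
    # on every run (for testing only).
    sincedb_path => "NUL"
  }
}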