Logstash _grokparsefailure error

Hi,
I am trying to parse a simple log file to understand how logstash works.

This is my log format:
2017-06-14 11:17:48 [ad8880] INFO: blah blah blah

And I have built the following grok regex using grok Constructor
%{TIMESTAMP_ISO8601:logsimestamp}%{SPACE}[%{WORD:threadID}]%{SPACE}%{LOGLEVEL:loglevel}:%{SPACE}%{GREEDYDATA:task}

But still I am getting _grokparsefailure as below:
"message":"2017-06-14 11:17:48 [ad8880] INFO: blah blah blah\r","tags":["_grokparsefailure"]}

I tried changing the date format to :
%{YEAR}-%{MONTHNUM}-%{MONTHDAY}%{SPACE}%{TIME}

but still no luck.

here is my config file:

input {
  file {
    path => "xxxxxx.txt"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "Message" => "%{TIMESTAMP_ISO8601:logsimestamp}%{SPACE}[%{WORD:threadID}]%{SPACE}%{LOGLEVEL:loglevel}:%{SPACE}%{GREEDYDATA:task}"}
  }
}

output {
  file {
    path => "xxx.txt"
  }
}

Hi,

When I just look at

match => { "Message" => "%{TIMESTAMP_ISO8601:logsimestamp}%{SPACE}[%{WORD:threadID}]%{SPACE}%{LOGLEVEL:loglevel}:%{SPACE}%{GREEDYDATA:task}"}

I think something is not right.

Try escaping [ and ] as \[ and \]; you'll probably have more luck :slight_smile:
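For illustration, the same filter with the brackets escaped might look like this (everything else kept exactly as in your config, including the "Message" field name):

```
filter {
  grok {
    match => { "Message" => "%{TIMESTAMP_ISO8601:logsimestamp}%{SPACE}\[%{WORD:threadID}\]%{SPACE}%{LOGLEVEL:loglevel}:%{SPACE}%{GREEDYDATA:task}" }
  }
}
```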

Hey! Thanks for the reply! Still no luck! There are characters like " and ' in my GREEDYDATA. Is that a problem?

And I am also expecting output like

logtimestamp: 2017-6-.....
threadID: xyz
loglevel: INFO

etc. Will I see that after the log is successfully parsed without errors, or do I need to make some changes to the config file?

No, I don't think so.

Try

http://grokconstructor.appspot.com/do/construction

or

https://grokdebug.herokuapp.com/

They are both quite good places to get your custom pattern working.

I tried them.. that's how I got the regex. Does it have something to do with \r?

Are you on Windows or Linux?
Windows uses CRLF (\r\n, 0D 0A) line endings while Unix just uses LF (\n, 0A).
A bare \r is nothing valid by default.

If you have something special, just set the delimiter:
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-file.html#plugins-inputs-file-delimiter
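For example, a file input with an explicit CRLF delimiter might look like this (path is a placeholder taken from the config above; note that on some Logstash versions escape sequences in quoted config strings are only interpreted when config.support_escapes is enabled):

```
input {
  file {
    path => "xxxxxx.txt"
    start_position => "beginning"
    delimiter => "\r\n"    # split events on Windows-style line endings
  }
}
```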

Build your grok expression gradually and pay attention when things stop working. Start with %{TIMESTAMP_ISO8601:logsimestamp}. Does that work? Then continue with the next (%{TIMESTAMP_ISO8601:logsimestamp}%{SPACE}\[%{WORD:threadID}\]).
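The first two stages of that incremental build could look like this (field name taken from the config above; each match line is a separate attempt, not two matches in one filter):

```
# Stage 1: timestamp only
match => { "Message" => "%{TIMESTAMP_ISO8601:logsimestamp}" }

# Stage 2: timestamp plus thread ID, with the brackets escaped
match => { "Message" => "%{TIMESTAMP_ISO8601:logsimestamp}%{SPACE}\[%{WORD:threadID}\]" }
```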

I am on Windows. @pts0

and @magnusbaeck I tried that.. nothing seems to work

Did you get more than one line?
Try converting to Unix newlines?

It works if I use this expression
(?(.|\r)*)

and when I try to build on to that, it stops working

And can you please tell me what converting to Unix newlines means?

Are you really sure you just have \r and not \r\n? A bare \r is really not Windows.

Yes! I am on Windows.

Maybe, since the default delimiter is "\n" (as I did not set any explicitly), the \n is getting chopped off before parsing, hence maybe only the \r is seen. Just my thought, you would know better; I just started using Logstash.

Then set the delimiter to \r\n; that should work.
And I'm just a user, not an expert :slight_smile:

Thanks for the reply. Tried that, still no luck!

I am able to overcome \r error. But still not able to get the pattern working.
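One common way to strip a stray trailing \r before grokking (just a sketch of one possible workaround; the thread doesn't say how it was actually solved) is a mutate/gsub filter:

```
filter {
  mutate {
    # remove a trailing carriage return left over from CRLF line endings
    gsub => [ "message", "\r$", "" ]
  }
}
```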

Works fine here:

$ cat data
2017-06-14 11:17:48 [ad8880] INFO: blah blah blah
$ cat test.config 
input { stdin { } }
output { stdout { codec => rubydebug } }
filter {
  grok {
    match => [
      "message",
      "%{TIMESTAMP_ISO8601:logsimestamp}%{SPACE}\[%{WORD:threadID}\]%{SPACE}%{LOGLEVEL:loglevel}:%{SPACE}%{GREEDYDATA:task}"
    ]
  }
}
$ /opt/logstash/bin/logstash -f test.config < data
Settings: Default pipeline workers: 8
Pipeline main started
{
         "message" => "2017-06-14 11:17:48 [ad8880] INFO: blah blah blah",
        "@version" => "1",
      "@timestamp" => "2017-06-20T05:36:58.681Z",
            "host" => "lnxolofon",
    "logsimestamp" => "2017-06-14 11:17:48",
        "threadID" => "ad8880",
        "loglevel" => "INFO",
            "task" => "blah blah blah"
}
Pipeline main has been shutdown
stopping pipeline {:id=>"main"}

Thanks for the reply Magnus!
I have a few thoughts why the result might be different:
I performed this test on Windows 10 with "File input plugin". Are there any chances that there can be any problems with EOL characters or with File opening or closing?

I also performed a small test to analyse the issue. Following your advice to go step by step, I wanted to see if the setup was correct.
My input file contained:
123
456
789

If the grok pattern is
{
match=>{"message",%{NUMBER}} // This gave me _grokparefailure
}

but the pattern
{
match=>{"message",(?[0-9]*)} // did not give me any error
}

but in either case, I was unable to see fields tag in the stdout (Is the fields tag updated only if the grok parsing succeeds?)

I performed this test on Windows 10 with "File input plugin". Are there any chances that there can be any problems with EOL characters or with File opening or closing?

Unlikely.

match=>{"message",%{NUMBER}} // This gave me _grokparefailure

Always surround strings with double quotes.
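For instance, that match written with proper quoting and the hash syntax might look like this (the num field name is just an example):

```
grok {
  match => { "message" => "%{NUMBER:num}" }
}
```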

but in either case, I was unable to see fields tag in the stdout (Is the fields tag updated only if the grok parsing succeeds?)

What output plugin are you using?

For debugging I am using stdout. Please see the following configurations and results:

1)
file content : 123

config:
input {
  file {
    path => ["C:\Users\xyz\Downloads\abc-20170523192613978.log"]
    path => "C:\Users\xyz\Desktop\Demo\WriteText.txt"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "@message" => "%{GREEDYDATA:data}"}
  }
}

output {
  stdout { codec => rubydebug }
}

My commandline shows following output:
{
          "path" => "C:\Users\xyz\Desktop\Demo\WriteText.txt",
    "@timestamp" => 2017-06-20T16:18:33.956Z,
      "@version" => "1",
          "host" => "ABC",
       "message" => "123"
}

// no data tag.

2)
contents of input text:
123
789

config2:
grok {
  match => { "@message" => "%{NUMBER:data}"}
}

output
{
          "path" => "C:\Users\xyz\Desktop\Demo\WriteText.txt",
    "@timestamp" => 2017-06-20T16:22:56.167Z,
      "@version" => "1",
          "host" => "ABC",
       "message" => "789",
          "tags" => [
        [0] "_grokparsefailure"
    ]
}
// I am getting a parse error for a simple number input, so I am wondering if the problem is with the Windows .txt file and encoding or something, because grok is able to parse it as GREEDYDATA but not as NUMBER. And there are no field tags in either output.
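One thing worth double-checking (an observation, not something confirmed later in the thread): the file input puts each line into a field named message, not @message, so a match keyed on "@message" has nothing to match against. A sketch with the default field name:

```
filter {
  grok {
    match => { "message" => "%{NUMBER:data}" }
  }
}
```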

Upon using the --debug flag I found this; I do not know if it is useful or not:

_globbed_files: C:\Users\xyz\Desktop\Demo\WriteText.txt: glob is: []

_globbed_files: C:\Users\xyz\Desktop\Demo\WriteText.txt: glob is: ["C:\Users\xyz\Desktop\Demo\WriteText.txt"] because glob did not work