Logstash _grokparsefailure error

Hi,
I am trying to parse a simple log file to understand how logstash works.

This is my log format:
2017-06-14 11:17:48 [ad8880] INFO: blah blah blah

And I have built the following grok regex using grok Constructor
%{TIMESTAMP_ISO8601:logsimestamp}%{SPACE}[%{WORD:threadID}]%{SPACE}%{LOGLEVEL:loglevel}:%{SPACE}%{GREEDYDATA:task}

But still I am getting _grokparsefailure as below:
"message":"2017-06-14 11:17:48 [ad8880] INFO: blah blah blah\r","tags":["_grokparsefailure"]}

I tried changing the date format to :
%{YEAR}-%{MONTHNUM}-%{MONTHDAY}%{SPACE}%{TIME}

but still no luck.

here is my config file:

input {
  file {
    path => "xxxxxx.txt"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "Message" => "%{TIMESTAMP_ISO8601:logsimestamp}%{SPACE}[%{WORD:threadID}]%{SPACE}%{LOGLEVEL:loglevel}:%{SPACE}%{GREEDYDATA:task}"}
  }
}

output {
  file {
    path => "xxx.txt"
  }
}

Hi,

When I just look at

match => { "Message" => "%{TIMESTAMP_ISO8601:logsimestamp}%{SPACE}[%{WORD:threadID}]%{SPACE}%{LOGLEVEL:loglevel}:%{SPACE}%{GREEDYDATA:task}"}

I think something is not right.

Try escaping [ and ] as \[ and \]; you'll probably have more luck :slight_smile:
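For illustration, the same filter with the brackets escaped might look like this (everything else kept exactly as in your config, including the "Message" field name):

```
filter {
  grok {
    match => { "Message" => "%{TIMESTAMP_ISO8601:logsimestamp}%{SPACE}\[%{WORD:threadID}\]%{SPACE}%{LOGLEVEL:loglevel}:%{SPACE}%{GREEDYDATA:task}" }
  }
}
```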

Hey! Thanks for the reply! Still no luck! There are characters like " and ' in my GREEDYDATA. Is that a problem?

And I am also expecting output like

logtimestamp: 2017-6-.....
threadID: xyz
loglevel: INFO

etc. Will I see that after the log is successfully parsed without errors, or do I need to make some changes to the config file?

No, I don't think so.

Try

http://grokconstructor.appspot.com/do/construction

or

https://grokdebug.herokuapp.com/

They are both quite good places to get your custom pattern working.

I tried them.. that's how I got the regex. Does it have something to do with \r?

Are you on Windows or Linux?
Windows uses CRLF (\r\n, 0D 0A) line endings while Unix just uses LF (\n, 0A).
A bare \r is nothing valid by default.

If you have something special, just set the delimiter:
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-file.html#plugins-inputs-file-delimiter
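For example, a file input with an explicit CRLF delimiter might look like this (path is a placeholder taken from the config above; note that on some Logstash versions escape sequences in quoted config strings are only interpreted when config.support_escapes is enabled):

```
input {
  file {
    path => "xxxxxx.txt"
    start_position => "beginning"
    delimiter => "\r\n"    # split events on Windows-style line endings
  }
}
```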

Build your grok expression gradually and pay attention when things stop working. Start with %{TIMESTAMP_ISO8601:logsimestamp}. Does that work? Then continue with the next (%{TIMESTAMP_ISO8601:logsimestamp}%{SPACE}\[%{WORD:threadID}\]).
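The first two stages of that incremental build could look like this (field name taken from the config above; each match line is a separate attempt, not two matches in one filter):

```
# Stage 1: timestamp only
match => { "Message" => "%{TIMESTAMP_ISO8601:logsimestamp}" }

# Stage 2: timestamp plus thread ID, with the brackets escaped
match => { "Message" => "%{TIMESTAMP_ISO8601:logsimestamp}%{SPACE}\[%{WORD:threadID}\]" }
```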

I am on Windows. @pts0

and @magnusbaeck I tried that.. nothing seems to work

Did you get more than one line?
Try converting to Unix newlines?

It works if I use this expression
(?(.|\r)*)

and when I try to build on to that, it stops working

And can you please tell me what converting to Unix newlines means?

Are you really sure you just have \r and not \r\n? A bare \r is really not Windows.

Yes! I am on Windows.

Maybe, since the default delimiter is "\n" (as I did not set any explicitly), the \n is getting chopped off before parsing, hence maybe only the \r is seen. Just my thought, you would know better; I just started using Logstash.

Then set the delimiter to \r\n; that should work.
And I'm just a user, not an expert :slight_smile:

Thanks for the reply. Tried that, still no luck!

I am able to overcome \r error. But still not able to get the pattern working.
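One common way to strip a stray trailing \r before grokking (just a sketch of one possible workaround; the thread doesn't say how it was actually solved) is a mutate/gsub filter:

```
filter {
  mutate {
    # remove a trailing carriage return left over from CRLF line endings
    gsub => [ "message", "\r$", "" ]
  }
}
```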

Works fine here:

$ cat data
2017-06-14 11:17:48 [ad8880] INFO: blah blah blah
$ cat test.config 
input { stdin { } }
output { stdout { codec => rubydebug } }
filter {
  grok {
    match => [
      "message",
      "%{TIMESTAMP_ISO8601:logsimestamp}%{SPACE}\[%{WORD:threadID}\]%{SPACE}%{LOGLEVEL:loglevel}:%{SPACE}%{GREEDYDATA:task}"
    ]
  }
}
$ /opt/logstash/bin/logstash -f test.config < data
Settings: Default pipeline workers: 8
Pipeline main started
{
         "message" => "2017-06-14 11:17:48 [ad8880] INFO: blah blah blah",
        "@version" => "1",
      "@timestamp" => "2017-06-20T05:36:58.681Z",
            "host" => "lnxolofon",
    "logsimestamp" => "2017-06-14 11:17:48",
        "threadID" => "ad8880",
        "loglevel" => "INFO",
            "task" => "blah blah blah"
}
Pipeline main has been shutdown
stopping pipeline {:id=>"main"}

Thanks for the reply Magnus!
I have a few thoughts why the result might be different:
I performed this test on Windows 10 with "File input plugin". Are there any chances that there can be any problems with EOL characters or with File opening or closing?

I also performed a small test to analyse the issue. Following your advice to go step by step, I wanted to see if the setup was correct.
My input file contained:
123
456
789

If the grok pattern is
{
match=>{"message",%{NUMBER}} // This gave me _grokparefailure
}

but the pattern
{
match=>{"message",(?[0-9]*)} // did not give me any error
}

but in either case, I was unable to see fields tag in the stdout (Is the fields tag updated only if the grok parsing succeeds?)

I performed this test on Windows 10 with "File input plugin". Are there any chances that there can be any problems with EOL characters or with File opening or closing?

Unlikely.

match=>{"message",%{NUMBER}} // This gave me _grokparefailure

Always surround strings with double quotes.
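For instance, that match written with proper quoting and the hash syntax might look like this (the num field name is just an example):

```
grok {
  match => { "message" => "%{NUMBER:num}" }
}
```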

but in either case, I was unable to see fields tag in the stdout (Is the fields tag updated only if the grok parsing succeeds?)

What output plugin are you using?

For debugging I am using stdout. Please see the following configurations and results:

1)
file content : 123

config:
input {
  file {
    path => ["C:\Users\xyz\Downloads\abc-20170523192613978.log"]
    path => "C:\Users\xyz\Desktop\Demo\WriteText.txt"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "@message" => "%{GREEDYDATA:data}"}
  }
}

output {
  stdout { codec => rubydebug }
}

My commandline shows following output:
{
          "path" => "C:\Users\xyz\Desktop\Demo\WriteText.txt",
    "@timestamp" => 2017-06-20T16:18:33.956Z,
      "@version" => "1",
          "host" => "ABC",
       "message" => "123"
}

// no data tag.

2)
contents of input text:
123
789

config2:
grok {
  match => { "@message" => "%{NUMBER:data}"}
}

output
{
          "path" => "C:\Users\xyz\Desktop\Demo\WriteText.txt",
    "@timestamp" => 2017-06-20T16:22:56.167Z,
      "@version" => "1",
          "host" => "ABC",
       "message" => "789",
          "tags" => [
        [0] "_grokparsefailure"
    ]
}
// I am getting a parse error for a simple number input, so I am wondering if the problem is with the Windows .txt file and encoding or something, because grok is able to parse it as GREEDYDATA but not as NUMBER. And there are no field tags in either output.
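One thing worth double-checking (an observation, not something confirmed later in the thread): the file input puts each line into a field named message, not @message, so a match keyed on "@message" has nothing to match against. A sketch with the default field name:

```
filter {
  grok {
    match => { "message" => "%{NUMBER:data}" }
  }
}
```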

Upon using the --debug flag I found this; I do not know if it is useful or not:

_globbed_files: C:\Users\xyz\Desktop\Demo\WriteText.txt: glob is: []

_globbed_files: C:\Users\xyz\Desktop\Demo\WriteText.txt: glob is: ["C:\Users\xyz\Desktop\Demo\WriteText.txt"] because glob did not work