Grokking syslog: seem to be getting a match but still getting _grokparsefailure

I am trying to parse and store a lot of syslog data, currently RHEL syslog plus log4j output from Spring apps (the constraint is that it has to come via rsyslog for now).

My input is :

    input {
      udp {
        port => 514
        type => "syslog"
      }
    }

and my grok filter is :

http://pastebin.com/gCEzbnMQ

In my logstash.stdout, AFAIK I am matching OK as my "match1 greedy" tag has been added, yet I still get lots of parsefailures :expressionless:

http://pastebin.com/8yucwwgU

Correct me if I am wrong, but if I get the right matches in this grok, I am expecting not to see any parse failures, or anything at all, in logstash.stdout?

I'm using logstash-1.4.2-1_2c0f5a1.noarch on RHEL 6.6

Hey, I just threw your code into https://grokdebug.herokuapp.com/ and it seems to be parsing correctly, so your pattern is correct. That means it's something else in your config. I have never used the option:

     tag_on_failure => [ '' ] 

And I don't think you need it, since you left the value blank in your config. Plus these extra options:

    break_on_match => 'true'
    drop_if_match => 'true'

Are they really necessary, since you only have one match case? Can you try removing these three lines and see if the _grokparsefailure still occurs?

Hi, I have removed those and retried; still getting _grokparsefailure, I'm afraid :frowning:

http://pastebin.com/XmPEQryP

http://pastebin.com/e4rksP6f

This is just so strange: it's parsing your input correctly, 100%. You can see that even the logmessage is your GREEDYDATA. I am stumped and do not know what to say. My last advice is to strip your config down to a minimal, bare-bones design and work from there. Instead of getting input from a port, read from a file or use stdin, and for your output use stdout with the rubydebug codec. If there is still a problem, post your code and I'll try it on my VM. One last thing: here is how I would change your grok match pattern, as a nice and simple config:

    input { stdin { } }
    filter {
      grok {
        match => [ 'message', '<%{POSINT:pri}>%{SYSLOGTIMESTAMP:timestamp} %{IPORHOST:hostname} %{WORD:app_name} %{WORD:level} %{GREEDYDATA:message}' ]
      }
    }
    output {
      stdout { codec => rubydebug }
    }

Interestingly, using the stdin barebones config, I don't get a grokparse... so presumably something to do with the input part? Will experiment.

That is good to hear. But it would be strange if the input were the issue here, since it's reporting a _grokparsefailure, which makes me think it was something in your message. Notice that I changed your GREEDYDATA field name to just message! Usually after that line I overwrite the original message with what is captured by GREEDYDATA. Keep at it! If I were you, I would keep building this bare-bones config up, step by step, toward what you want it to look like.
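For reference, the overwrite step mentioned above might look roughly like this. It is only a sketch, assuming the grok filter's `overwrite` option is available in your Logstash version; the pattern is the same one from the bare-bones config:

```
filter {
  grok {
    match => [ 'message', '<%{POSINT:pri}>%{SYSLOGTIMESTAMP:timestamp} %{IPORHOST:hostname} %{WORD:app_name} %{WORD:level} %{GREEDYDATA:message}' ]
    # Replace the original message with the GREEDYDATA capture
    # instead of adding a second value to the same field
    overwrite => [ 'message' ]
  }
}
```

Without `overwrite`, capturing into a field that already exists may leave you with both the raw line and the captured text in `message`.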

Next weird fact - with this config, on the messages I expect to not match, I get the [0] "not a _grokparsefailure_honest", but the ones I expect not to match do not get my custom failure tag but I still get "_grokparsefailure" in there! argh

http://pastebin.com/ikCT9h5C

Either both cases are giving you a grok mismatch, or there are just too many negatives in your sentence for me to follow. Please elaborate some more and paste the log lines you are trying to grok. When you use

    tag_on_failure => [ "not a _grokparsefailure_honest" ]

you should expect to find this tag wherever your grok match fails, which I think is what you are expecting. If you want to remove the _grokparsefailure tag, I believe you can use the remove_tag option.
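A rough sketch of that idea, assuming the stray tag is being added somewhere other than this grok block (which I have not confirmed), and using a separate mutate filter so the tag is removed whether or not the grok matched:

```
filter {
  grok {
    match => [ 'message', '<%{POSINT:pri}>%{SYSLOGTIMESTAMP:timestamp} %{IPORHOST:hostname} %{WORD:app_name} %{WORD:level} %{GREEDYDATA:message}' ]
    # Custom tag added only when this grok fails to match
    tag_on_failure => [ 'not a _grokparsefailure_honest' ]
  }
  # Strip the default failure tag if something else added it
  mutate {
    remove_tag => [ '_grokparsefailure' ]
  }
}
```

Keep in mind this only hides the symptom; it would still be worth finding out what is adding the default tag in the first place.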