Weird behavior with regexp conditionals


#1

Hi All,

I am getting a weird behavior with regexp conditionals.

I am using Logstash 1.5.4. The logs I am parsing should have the following format under normal conditions: Tue Sep 22 18:34:16 UTC 2015: Log Message

I want to mark all logs that start with that date pattern as INFO and logs that start with some other string as ERROR. In order to do this I am using a regexp conditional to determine if the line starts with that date pattern.

The problem is that this seems to work 99.9% of the times, but every once in a while I get a false positive. The regexp is failing to match a log messages that has the correct pattern. I've re-processed these messages using the same filter and it successfully matches the pattern the second time around.

This is the filter I am using:

filter
{
    if [type] == "mylog"
    {
        if [message] =~ /^[a-zA-Z]+\s[a-zA-Z]+\s+[0-9]+\s[0-9]+:[0-9]+:[0-9]+\s[a-zA-Z]+\s[0-9]+:/
        {
            #parse log message
            mutate
            {
                add_field => { "severity" => "INFO" }
            }
        }else
        {
            mutate
            {
                add_field => { "severity" => "ERROR" }
            }
        }
    }
}

Have you guys run into a similar issue? Do you have any suggestions? What would be a good workaround?

Thanks!


(Jay Greenberg) #2

This is disturbing because computers are usually so consistent :smile:

If you have a reproducible margin of 1% false positive, then please share some sample input with me via S3 or Dropbox, or the like, and I will attempt to reproduce it on my end.

Thanks


(system) #3