[SOLVED] Strange grokparsefailure behavior

Hi all,

I'd like to share a strange _grokparsefailure behavior that I'm trying to understand.

Logstash Input:

input {
   file {
      path => "/etc/logstash/test-2016-04-17-05.log"
      start_position => beginning
      ignore_older => 0
   }
}

Logstash Filter:

filter {
   grok {
      match => { "message" => [ "%{DHCP_ACK}", "%{DHCP_OFFER}" ] }
      add_tag => [ "infoblox" ]
      match => { "path" => "%{YEAR:log_year}" }
   }
   mutate {
      add_field => { "@source_host" => "%{Infoblox_server}" }
   }
   dns {
      nameserver => "192.168.1.1"
      reverse => [ "@source_host" ]
      action => "replace"
   }
}

Logstash Output:

output {
   file {
      path => "/etc/logstash/test_infoblox_output.txt"
   }
   stdout {
   }
}

When my input file contains just one line ("2016-04-17T05:35:55+02:00 192.168.1.20 info Added new forward map from dhcp-192.168.2.3.test.corp to 192.168.2.4"), I get the following output:

{"message":"2016-04-17T05:35:55+02:00 192.168.1.20 info Added new forward map from dhcp-192.168.2.3.test.corp to 192.168.2.4","@version":"1","@timestamp":"2016-04-26T08:38:39.254Z","path":"/etc/logstash/test-2016-04-17-05.log","host":"log1","tags":["_grokparsefailure"],"@source_host":"%{Infoblox_server}"}

For me, this is the correct result given my patterns (tested with the grok debugger).

But when my input file contains 3 million lines, I get the following output:

{"message":"2016-04-17T05:35:55+02:00 192.168.1.20 info Added new forward map from dhcp-192.168.2.3.test.corp to 192.168.1.20","@version":"1","@timestamp":"2016-04-25T15:23:39.780Z","path":"/etc/logstash/test-2016-04-17-05.log","host":"log1","log_year":"2016","tags":["infoblox"],"@source_host":"%{Infoblox_server}"}

I don't understand why I don't get "_grokparsefailure" as expected.

If you have any ideas, please let me know!

Thanks in advance,
Alexandre

I ran some more tests, and now I don't get _grokparsefailure at all, even with a line like "test" that matches none of my patterns.

I can't understand why...

OK, I have some news.

After several tests, I found what causes this strange behavior.

It depends on my file name.

If I name my input file "test.log", I get the expected behavior. But if I name it "infoblox-2016-04-17-05_12.log", I don't.

So I'm still trying to understand why. If someone has an idea, please let me know!

Does Logstash take the file name into account?

Thanks,
Alex

I understand my mistake.

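When you use match => { "path" => "%{YEAR:log_year}" } inside a grok filter, it is applied to the "path" field, which exists on every event because the file input plugin sets it.

So when the file name contains a year, the grok filter matches this field and adds the field "log_year" as expected. As far as I can tell, grok only adds _grokparsefailure when none of its match entries succeed, so the successful "path" match hides the failed "message" match, and add_tag is applied as well.

I didn't know that the input plugin also adds fields: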

  • host
  • @version
  • @timestamp
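
For anyone hitting the same thing, here is a minimal sketch of one way to avoid it, assuming the goal is for only the "message" patterns to decide whether an event is tagged "infoblox". It splits the two matches into separate grok filters so that a match on "path" can no longer mask a failure on "message". DHCP_ACK and DHCP_OFFER are my own custom patterns (their definitions are not shown here), and the "_pathparsefailure" tag name is just a hypothetical choice to keep the two kinds of failure apart.

```
filter {
   # First grok: only the message patterns decide success or failure,
   # so add_tag is applied only to lines matching DHCP_ACK or DHCP_OFFER.
   grok {
      match => { "message" => [ "%{DHCP_ACK}", "%{DHCP_OFFER}" ] }
      add_tag => [ "infoblox" ]
   }
   # Second grok: extract the year from the file path independently.
   # A separate failure tag (hypothetical name) keeps this from being
   # confused with a _grokparsefailure on the message.
   grok {
      match => { "path" => "%{YEAR:log_year}" }
      tag_on_failure => [ "_pathparsefailure" ]
   }
}
```

With this layout, a line that matches none of the message patterns should still get _grokparsefailure from the first filter, whatever the file is called.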

Have a good day!
Alex