Appending app field to coreos logs

I have logs coming in from CoreOS nodes via Filebeat. The "source" field has the general form: /var/log/containers/<name of container>-<some alphanumeric string I don't care about>

I'd like to parse the part after /var/log/containers/, and have found that the regex ^(\w+\b-\w+\b) should pull out the part I need. I have put this into a file in the pattern_dir.

So my grok filter currently looks like this:

filter {
  grok {
    match => { "source" => "%{GREEDYDATA}/%{CONTAINERAPP:app}*" }
    patterns_dir => ["/patterns"]
  }
}

The patterns directory has a file "containerlogs" containing:

CONTAINERAPP ^(\w+\b-\w+\b)

...but I currently don't even see an "app" field. According to logstash's output, it seems happy with this pattern.

Can anybody point me in the right direction?

Don't include ^ in the grok pattern. Do the resulting events have a _grokparsefailure tag?
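A minimal sketch of why the anchor matters, using Python's re module as a stand-in for grok's Oniguruma engine (an assumption: the two engines differ in details, and the expanded expressions below are only an approximation of what grok builds). The sample path is hypothetical. Once ".*/" has consumed characters, "^" can no longer match, and because the grok expression ends in "*", the capture is allowed to match zero times. That would be consistent with seeing neither an "app" field nor a parse failure.

```python
import re

# Hypothetical sample value for the "source" field (the suffix is invented).
source = "/var/log/containers/my-app-0123abcd.log"

# Rough expansion of "%{GREEDYDATA}/%{CONTAINERAPP:app}*" with the anchored pattern.
anchored = re.compile(r".*/(?P<app>^(\w+\b-\w+\b))*")
# The same expression with "^" removed from CONTAINERAPP.
unanchored = re.compile(r".*/(?P<app>(\w+\b-\w+\b))*")

m_anchored = anchored.search(source)
m_unanchored = unanchored.search(source)

# The anchored version still matches overall, but the group never participates.
print(m_anchored is not None, m_anchored.group("app"))  # True None
# Without the anchor, the first two hyphen-separated words are captured.
print(m_unanchored.group("app"))  # my-app
```

Dropping the trailing "*" as well would turn a missing capture into a visible match failure instead of a silent one.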

Thank you so much! That worked a treat!

Now, I've noticed that I have to match a different regex in the <name of container> part of the string, but I guess that's beyond the scope of this thread.

Actually, one further question - why does this work? We only want the alphanumeric words at the beginning, so why doesn't it match everything like the various regex checkers I've found online say it should?

OK, after some playing around, I need more help.

I changed my regex to this:

CONTAINERAPP .+?(?=-\d) 

As I need to match everything up to some digits. I now don't even get an "app" field being sent to logstash.

Any chance of a bit more help?

Off the top of my head I'm not sure what's going on. Your positive lookahead looks correct.
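For what it's worth, the lookahead can be checked in isolation. The sketch below uses Python's re module rather than grok, so it only shows that the regex itself is sound, not how Logstash handles it. The file name is hypothetical; only the container name comes from this thread.

```python
import re

# Hypothetical file name; the numeric suffix and extension are invented.
basename = "my-really-even-more-pukka-container-1234abcd.log"

# Lazily match everything up to (but not including) the first "-<digit>".
m = re.match(r".+?(?=-\d)", basename)
print(m.group(0))  # my-really-even-more-pukka-container
```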

Thanks for looking. Do you think this may be a bug? Should I create an issue on github? Should I take more debugging steps first?

Thanks

Jerry

Statistically very few grok problems are caused by bugs so I'd debug this further.

I have looked at this again, and grok still seems not to work with this particular regex. With others it works, but of course they do not match the pattern I want.

I have looked at Grok debugger, plugging in the relevant values, and it gives me the output:

{
  "GREEDYDATA": [
    [
      "/var/log/containers"
    ]
  ],
  "container": [
    [
      "my-really-even-more-pukka-container"
    ]
  ]
}

...but the field still does not appear in the output. Could it be a bug?

If you need to match everything up to the first digit, why not use a basic non-digit pattern (\D+)?

Thanks @Christian_Dahlqvist, almost there - I end up with:

{
  "GREEDYDATA": [
    [
      "/var/log/containers"
    ]
  ],
  "container": [
    [
      "my-really-even-more-pukka-container-"
    ]
  ]
}

(note the trailing hyphen). Even better, it works when I test in logstash. That's what's confusing me here: why does the previous one work in the grok debugger, but fail when I try it in logstash?
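On the trailing hyphen: one possible way to drop it without a lookahead is to keep the "-<digit>" outside the capture. The sketch below again uses Python's re module with a hypothetical file name. The grok equivalent would presumably be something like "%{CONTAINERAPP:app}-\d" with CONTAINERAPP defined as \D+, but that has not been tested in Logstash here.

```python
import re

# Hypothetical file name; the numeric suffix and extension are invented.
basename = "my-really-even-more-pukka-container-1234abcd.log"

# \D+ alone runs up to the first digit, so it keeps the trailing hyphen.
with_hyphen = re.match(r"\D+", basename).group(0)

# Requiring "-<digit>" after the capture makes \D+ give the hyphen back.
without_hyphen = re.match(r"(?P<app>\D+)-\d", basename).group("app")

print(with_hyphen)     # my-really-even-more-pukka-container-
print(without_hyphen)  # my-really-even-more-pukka-container
```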
