Understanding metrics filters

Hi, I'm having trouble understanding the proper way to use the metrics filter plugin to count the number of events with a particular field value. Say I have a file formatted with a name and an age separated by a space, e.g.:

dustin 40
emily 35
joe 40

I am trying to use a metrics filter like the following:

filter {
  grok {
    match => { "message" => "%{WORD:name} %{NUMBER:age}" }
  }

  metrics {
    meter => "age_%{age}"
    add_tag => ["meter", "%{age}"]
    add_field => { "age" => "%{age}" }
  }
}

But when the tag and field are added, I only get the literal string "%{age}" in the event instead of the value of the age field from the log (e.g. 40, 35, etc.):

{
           "age" => "%{age}",
      "@version" => "1",
        "age_35" => {
           "count" => 1,
         "rate_1m" => 0.18400888292586468,
         "rate_5m" => 0.19669429076432351,
        "rate_15m" => 0.19889196960097935
    },
          "tags" => [
        [0] "meter",
        [1] "%{age}"
    ],
       "message" => "hostname",
    "@timestamp" => 2023-01-06T23:45:04.733452Z
}

What am I doing wrong here?

On a related note, I am also having trouble outputting the counter values for the different ages. For example, I'd like the ultimate output to be the count of each age value:

age_40: 2
age_35: 1

I have tried an output statement like:

output {
  if "meter" in [tags] {
    stdout {
      codec => line { format => "age_40: %{[age_40][count]}" }
    }
  }
}

which works okay, but it is hard-coded.

There doesn't seem to be any way to interpolate the age into the output string so that I get my desired output: the aggregated count per age. Is there a better way to go about this? Should I be using an aggregate filter instead?

From the documentation: "Metrics appear as new events in the event stream and go through any filters that occur after as well as outputs."

So you are creating metrics for events that have an [age] field, but the additional events that the metrics filter creates will not have an [age] field, so your add_field and add_tag do not work.

Note that it does not make sense to have an [age] field on the metrics event, since each metrics event contains data for every age:

{
        "age_40" => {
         "rate_1m" => 0.10543885524629079,
        "rate_15m" => 0.3659788914920122,
         "rate_5m" => 0.30637133534585953,
           "count" => 2
    },
        "age_35" => {
         "rate_1m" => 0.052719427623145396,
        "rate_15m" => 0.1829894457460061,
         "rate_5m" => 0.15318566767292977,
           "count" => 1
    },
    "@timestamp" => 2023-01-07T02:18:48.998068074Z,
    ...

Thanks Badger, that makes sense. I did kind of suspect that the reason I couldn't use age again was that the metrics are separate events from the initial logs they are based on, and so don't have an age field, but I was missing the important detail that all of the data for every age is output as a single event. I'm still having trouble understanding how to get the output I want, with a line for each age and its associated count, but I am going to read more and take another stab at it.

Do you want to retain the rate_ data? Or would you be happy transforming that to

{
  "age_40" => 2,
  "age_35" => 1,
  ...

I would be happy transforming it to just the counts. The plan is to output this in Prometheus format, have Prometheus scrape it, and use PromQL to generate rates based on the counts. On further investigation, it seems what I want to do is somehow split this one event into multiple events. I found some info on Stack Overflow about doing that with the ruby filter (configuration - Logstash dynamically split events - Stack Overflow), but it seems the event API only lets me get and set fields. The problem is that each of my keys contains the age in the key name itself, so I don't know the names of the keys in the event ahead of time, and I'm not sure how to do this yet. If I could somehow get all the keys in the event that match a regex like 'age_\d+' or something like that, then I could probably figure it out. Any tips are appreciated.

A ruby filter like

    grok { match => { "message" => "%{WORD:name} %{NUMBER:age}" } }
    metrics { meter => "age_%{age}" add_tag => [ "meter" ] }
    if "meter" in [tags] {
        ruby {
            code => '
                event.to_hash.each { |k, v|
                    if k =~ /age_\d+/
                        event.set(k, v["count"])
                    end
                }
            '
        }
    }

will get you

 {
           "tags" => [
         [0] "meter"
     ],
         "age_35" => 1,
         "age_40" => 2,
          ....

If you want to split to an event per age then you could start with

    if "meter" in [tags] {
        ruby {
            code => '
                a = []
                event.to_hash.each { |k, v|
                    if k =~ /age_\d+/
                        a << { "age" => k.sub("age_", ""), "count" => v["count"] }
                        event.remove(k)
                    end
                }
                event.set("stuff", a)
            '
        }
        split { field => "stuff" }

That gives you events with predictable field names: [stuff][age] and [stuff][count].
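Untested, but with the split in place an output along these lines should then print one line per age (the format string is just a sketch you can adjust):

```
output {
  if "meter" in [tags] {
    stdout {
      codec => line { format => "age_%{[stuff][age]}: %{[stuff][count]}" }
    }
  }
}
```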

If you have further questions I will reply in the morning.

Wow, thanks Badger. This is exactly what I needed. I had no idea I could call to_hash on the event object, but I should have realized that since it's Ruby. This works perfectly when I echo new lines into my input file, but I do see strange behavior: lines of a particular age are not all counted if I start logstash pointed at an input file that already has some lines in it. It's a bit puzzling. Here's my full config, example input, and the output I see. FWIW, I'm using logstash-8.5.3.

input file:

dummy 55
dummy 55
dummy 55
dummy 55
dummy 55
dummy 55
dummy 77
dummy 77
dummy 88

config file:

input {
 file {
   path => ["/path/to/testdata.log"]
   sincedb_path => "/dev/null"
   start_position => "beginning"
  }
}

filter {
  grok {
    match => { "message" => "%{WORD:name} %{NUMBER:age}" }
  }

  metrics {
    meter => "age_%{age}"
    add_tag => ["meter"]
  }

  if "meter" in [tags] {
      ruby {
        code => '
            a = []
            event.to_hash.each { |k, v|
                if k =~ /age_\d+/
                    a << { "age" => k.sub("age_", ""), "count" => v["count"] }
                    event.remove(k)
                end
            }
            event.set("stuff", a)
        '
      }
      split { field => "stuff" }
  }
}

output {
  if "meter" in [tags] {
    stdout {
      codec => line { format => "age: %{[stuff][age]} count: %{[stuff][count]}" }
    }
  }
}

output:

age: 88 count: 1
age: 55 count: 4
age: 77 count: 2

The counts for 88 and 77 are correct, but the count for 55 is 4 when it should be 6. Any clue what could be going on there? Do you observe similar behavior with the same input and config?

Do you have pipeline.workers set to 1 to force all events to go through the same ruby filter instance? If not, you may have two or more ruby filters each counting a subset of the events.
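In case it helps, pipeline.workers can be set per pipeline in pipelines.yml (or globally in logstash.yml). A minimal sketch, with a hypothetical pipeline id and path:

```yaml
# pipelines.yml — hypothetical id and path, adjust to your deployment
- pipeline.id: age-metrics
  path.config: "/path/to/your/config.conf"
  pipeline.workers: 1   # single worker => one filter instance sees every event
```

The same thing can be done ad hoc on the command line with `bin/logstash -w 1`.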

Hi Badger, you are correct. pipeline.workers was set to the default of 12. When set to 1 I see the correct counts. Thanks again for your help. You're awesome.

Hi Badger, do you know of any way to solve this issue with incorrect counts besides setting pipeline.workers to 1? Unfortunately my deployment needs multiple workers. I've also observed incorrect counts with no ruby filter at all, using graphite_exporter with "fields_are_metrics => true" set, so it doesn't seem to be isolated to ruby filters.

You could do it all in a ruby filter using a class variable so that it is shared across threads. But then you need synchronization (see example 1 in this thread). I have no idea what the performance will be like.
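As a standalone sketch of that idea (plain Ruby outside logstash, with made-up field names, so you can see the shape before wiring it into a filter's init/code options): the shared hash lives in a class variable, so every filter instance in every worker sees the same one, and a Mutex guards updates.

```ruby
require 'thread'

# Sketch: count ages in state shared across all worker threads.
# Inside a ruby filter you would put the class in the init option and
# call AgeCounter.record(event.get("age")) in the code option.
class AgeCounter
  @@mutex  = Mutex.new      # guards all access to @@counts
  @@counts = Hash.new(0)    # shared across every instance/thread

  def self.record(age)
    @@mutex.synchronize { @@counts["age_#{age}"] += 1 }
  end

  def self.counts
    @@mutex.synchronize { @@counts.dup }  # return a safe copy
  end
end

# Simulate several pipeline workers processing events concurrently.
threads = 4.times.map do
  Thread.new { 100.times { AgeCounter.record(55) } }
end
threads.each(&:join)

puts AgeCounter.counts["age_55"]  # 400
```

Because the increment is inside the synchronize block, the total is exact no matter how many threads run, which is precisely what the per-worker metrics filter instances cannot give you.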

Okay, thanks for the tip. I'm now wondering if there is some kind of bug in the logstash metrics filter plugin with multiple pipeline workers, since even without the ruby filter we see incorrect counts with the graphite plugin. I'm going to report this as a bug and see what kind of response I get. Could it also be related to using a file input? That's admittedly grasping at straws, but without knowing logstash internals that's all I've got at the moment. It does seem wrong for it to never converge on correct counts with multiple threads. If this isn't a bug, then I am not sure what I am doing wrong.