Logstash pipeline metric events not output under load

I have a pipeline configured with Logstash 6.3.0 that should use the metrics filter to count events and then output those counts to statsd. My pipeline config is as follows (I've removed some company-specific config):

input {
  tcp {
    id => "tcp_input"
    port => 3515
    codec => 'json_lines'
    tags => ["tcp"]
  }
  http {
    id => "http_input"
    port => 8080
    tags => ["http"]
  }
  heartbeat {
    id => "heartbeat_input"
    add_field ...
    ...
    interval => 10
  }
}
filter {
  if "_jsonparsefailure" in [tags] { drop { } }

  if conditional 1 { drop { } }

  if conditional 2 { drop { } }

  if conditional 3 { drop { } }

  truncate {
    id => "truncate_large_message_field"
    fields => [ "Message", "message"]
    length_bytes => 10240
  }

  mutate {
    id => "main_mutate_filter"
    rename => ...
    ...
    rename => ...
    add_field => ...
    add_field => ...
  }

  metrics {
    id => "logstash_metrics"
    meter => [ "events" ]
    add_tag => "metric"
    flush_interval => 10
    rates => [1]
  }
}
output {
  if "metric" not in [tags] {
    tcp {
      host => "<%= logstash_address %>"
      port => 3515
      codec => "json_lines"
    }
  }

  if "metric" in [tags] {
    statsd {
      id => "statsd_output"
      gauge => { "events.rate_1m" => "%{[events][rate_1m]}" }
      count => { "events.count" => "%{[events][count]}" }
      port => "8125"
      host => "<%= monitoring_address %>"
      namespace => "<%= tenant %>"
      sender => "<%= feature %>_<%= position %>"
    }
  }
}

Without any load I see the metrics in our monitoring system, but as soon as I start pushing any load at Logstash the metrics stop appearing.

I have log.level: debug set in logstash.yml, and from the debug logs I can see that my statsd output is not being invoked under load.

I have 3 instances behind an ELB on AWS, currently c5.large, but I'm testing with other instance types.

Any help would be much appreciated.

Thanks
Andy

For reference, this is what we're seeing for our received metrics: (screenshot omitted)

I've figured this out: it's because we have a high batch size (pipeline.batch.size: 3000). My guess at what's happening: under low load the received events aren't really batched, so the periodic metric events flush through promptly. As load increases, events start to fill batches, but the metric events generated by the metrics filter are too infrequent relative to the batch size, so they don't get sent. I've tested reducing the batch size and the metrics come through as expected.
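For anyone hitting the same problem, this is a minimal sketch of the relevant logstash.yml settings. The values shown are the Logstash 6.x defaults for illustration, not necessarily what you should tune to:

# logstash.yml
# pipeline.batch.size is how many events a pipeline worker collects
# from the inputs before running filters and outputs. We had raised
# this to 3000; a smaller value lets infrequent metric events reach
# the statsd output sooner.
pipeline.batch.size: 125

# pipeline.batch.delay is how long (in ms) a worker waits for a
# partial batch to fill before flushing it anyway.
pipeline.batch.delay: 50

In our case simply lowering pipeline.batch.size back towards the default was enough; you could also experiment with pipeline.batch.delay, since it bounds how long a partially filled batch can sit before being processed.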

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.