Strange behaviour of metrics.count

I'm seeing intermittent drops in the count value from the metrics plugin that I can't explain.

My setup: a Java application uses logstash-logback-encoder, configured to emit only ERROR log messages, and connects via TCP to Logstash, which uses the metrics plugin to count the events and outputs the result to carbon-cache from the graphite package.
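For context, the application side looks roughly like the sketch below. The names are my own illustration: I'm assuming the [service] field referenced in the Logstash config comes from the MDC (it could equally come from the encoder's customFields); logstash-logback-encoder includes MDC entries in the JSON it sends over the TCP appender.

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class ErrorLogExample {
    private static final Logger log = LoggerFactory.getLogger(ErrorLogExample.class);

    public static void main(String[] args) {
        // Assumption: the "service" field used as %{[service]} in the Logstash
        // config is set via the MDC; logstash-logback-encoder copies MDC
        // entries into the JSON event it ships over TCP.
        MDC.put("service", "order-service");
        try {
            throw new IllegalStateException("simulated failure");
        } catch (IllegalStateException e) {
            // Only ERROR-level messages are shipped; the appender is
            // configured to filter out lower levels.
            log.error("Unhandled failure while processing a request", e);
        } finally {
            MDC.remove("service");
        }
    }
}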

The goal is to get statistics on the number of error log events per day and to visualize when they occur.

Why do I see intermittent "dips" in the metrics.count value several times per day? The Logstash process is NOT restarted during this period.

I'm using the following versions:
logstash-filter-metrics (4.0.2), logstash 5.1.1, python-carbon-0.9.12-3.el6.1.noarch, graphite-web-0.9.12-5.el6.noarch

The relevant parts of my Logstash configuration are:
input {
  tcp {
    port => 4560
    codec => json_lines
    tags => ["logback-tcp"]
  }
}

filter {
  if "logback-tcp" in [tags] and [level] == "ERROR" {
    metrics {
      meter => ["servers.%{[service]}.error_log"]
      add_tag => ["ERROR-logback-tcp"]
      flush_interval => 60
    }
  }
}

output {
  if "ERROR-logback-tcp" in [tags] {
    graphite {
      host => "{{ inventory_hostname }}"
      port => 2003
      include_metrics => ["servers.*error_log"]
      fields_are_metrics => true
    }
  }
}
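For reference, to see exactly what the metrics filter flushes every 60 seconds (the count and rate fields that end up in Graphite), a stdout output can be added next to the graphite one. This is only a debugging sketch, not part of the running configuration above:

output {
  # Debugging only: print the flushed metric events (they carry the
  # "ERROR-logback-tcp" tag added by the metrics filter) so the count and
  # rate fields can be compared with what carbon-cache stores.
  if "ERROR-logback-tcp" in [tags] {
    stdout { codec => rubydebug }
  }
}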

Regards
/Pär

The strange thing is that the rate from the same metric looks perfectly normal.
