But I'm a little confused about .output.events.total. For me it is always increasing, which implies that my output cannot ingest the logs quickly enough. However, it tracks .output.events.acked exactly, which contradicts the description of this metric. Could someone please clarify? My hunch (hope) is that the description of .output.events.total is actually incorrect and it is indeed the total number of output events recorded.
That is not correct... .output.events.total is not the running sum total of events since the process started... it is the total number of events over the 30s reporting period.
If both total and acked are increasing together, that would imply that your throughput is increasing... how long does that go on? If you have many files open, filebeat can take a little time to get up to full speed.
Thanks for the response. Yeah, by "cluster" I mean it's a horizontally scaled cluster: multiple (20 at the moment) filebeat containers running behind a load balancer, all listening on UDP. The version of filebeat is the latest (8.15.3). I use this as a log aggregator to capture logs from our services and then send them on to logstash. This used to be a logstash cluster, but filebeat is a little lighter, and I can utilise the compression_level when sending on.
For a bit more context - I checked this last week - over a 1.5 hour period from a fresh start, the .output.events.total (the sum from all containers) was 28.5m. I checked to see what was ultimately ingested into Elasticsearch for that same period and it matched. I can't imagine it's possible that the service was processing 28.5m logs in a 30s period; it must have been the total since startup. Possibly this is due to the way I'm capturing the metrics? Here is a section of the config:
I also have another filebeat "cluster" that reads logs from S3, I see the same behaviour with the .output.events.total (capturing the metrics the same way).
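As a rough sanity check on the figures above, here is a small sketch of the arithmetic. It assumes a 30s reporting interval and an even spread of load across the 20 containers (neither is guaranteed), and shows what a genuine per-30s value would look like for this volume.

```python
# Figures quoted in the post above.
TOTAL_EVENTS = 28_500_000      # summed .output.events.total across all containers
WINDOW_SECONDS = 1.5 * 3600    # 1.5 hours from a fresh start
CONTAINERS = 20

# Assumption: metrics are reported every 30 seconds.
INTERVAL_SECONDS = 30

intervals = WINDOW_SECONDS / INTERVAL_SECONDS   # number of 30s windows
per_interval = TOTAL_EVENTS / intervals         # events per 30s, all containers
per_container = per_interval / CONTAINERS       # events per 30s, one container (even load assumed)

print(f"{intervals:.0f} intervals")
print(f"~{per_interval:,.0f} events per 30s across the cluster")
print(f"~{per_container:,.0f} events per 30s per container")
```

If the counter really covered only 30s, a value in the region of the second figure would be expected, not 28.5m, which is consistent with the number being a total since startup.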
Yup, you're completely right. I guess I assumed it was the same metrics data.
I don't think I can use internal collection, as I don't have direct access to Elasticsearch in this scenario, only to Logstash, and internal collection doesn't appear to have a Logstash output available. That being said, I'm happy with the HTTP stats endpoint, now that I know the full details. Thanks for your help.
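For anyone else polling the HTTP stats endpoint: since those counters are cumulative, a per-interval figure has to be derived by subtracting consecutive snapshots. Below is a minimal sketch of that, not an official recipe. It assumes the endpoint returns JSON with the counters nested under libbeat.output.events, and the URL in fetch_stats is a placeholder for wherever your endpoint is exposed; the function names and sample numbers are made up for illustration.

```python
import json
from typing import Any, Mapping
from urllib.request import urlopen


def fetch_stats(url: str = "http://localhost:5066/stats") -> Mapping[str, Any]:
    """Fetch one stats snapshot. The URL is an assumption; adjust to your setup."""
    with urlopen(url, timeout=5) as response:
        return json.load(response)


def output_events(stats: Mapping[str, Any]) -> dict[str, int]:
    """Pull the cumulative output counters out of a snapshot.

    Assumes the counters live under libbeat.output.events.
    """
    events = stats["libbeat"]["output"]["events"]
    return {"total": int(events["total"]), "acked": int(events["acked"])}


def interval_delta(previous: Mapping[str, int], current: Mapping[str, int]) -> dict[str, int]:
    """Turn two cumulative snapshots into per-interval counts.

    If a counter went down, the process probably restarted, so the
    current value is used as the count for this interval.
    """
    deltas = {}
    for name, now in current.items():
        before = previous.get(name, 0)
        deltas[name] = now - before if now >= before else now
    return deltas


# Hypothetical snapshots taken 30s apart (numbers are illustrative only).
first = output_events({"libbeat": {"output": {"events": {"total": 1000, "acked": 990}}}})
second = output_events({"libbeat": {"output": {"events": {"total": 1600, "acked": 1590}}}})
print(interval_delta(first, second))
```

Summing the deltas from every container then gives a cluster-wide throughput per interval, while the raw values remain the totals since each process started.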