We are currently using the Kinesis input plugin for Logstash (https://github.com/logstash-plugins/logstash-input-kinesis) together with the basic X-Pack plugin, which allows us to monitor the event receive and emit rates.
We launch our Logstash nodes using the pipelines.yml file, which has entries that point to the configuration path of each pipeline.
We have run into an issue that is easy to see in the Monitoring section of Kibana. When a pipeline is reloaded (either by changing the pipeline config file or the pipelines.yml file), the event receive rate drops, as expected. After the pipeline finishes reloading, though, we see a much higher than expected event receive rate, in fact a few times higher.
Here is a concrete example of what we expect and what we actually see:
- Take a 'regular' event receive rate of 6000 events/sec across 3 pipelines (each receiving roughly 2000 events/sec)
- We reload a pipeline, and the event ingestion rate drops to 4000 events/sec, as expected, since a pipeline is not active while it is reloading
- The pipeline takes 15 seconds to reload, during which events get backed up and the Kinesis stream/shard iterator age increases, as expected
- The pipeline finishes reloading, and the total event receive rate shoots up to 10000-12000 events/sec for 15-20 seconds, then slowly returns to the normal rate of 6000 events/sec
With a throughput of 2000 events/sec and the pipeline offline for 15 seconds, we would expect around 30000 events to catch up on (2000 x 15), but the size of the burst suggests we are receiving many more than that.
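To make the mismatch concrete, here is a rough back-of-the-envelope calculation using only the figures above. It is a sketch, not a measurement: it assumes the burst holds at its peak rate for the whole 15-20 seconds, which overestimates the surplus somewhat since the rate actually tapers off. The function names are ours, purely for illustration.

```python
def expected_backlog(rate_per_sec, downtime_sec):
    """Events that pile up while one pipeline is offline."""
    return rate_per_sec * downtime_sec


def observed_surplus(peak_rate, normal_rate, burst_sec):
    """Extra events received above the normal rate during the burst.

    Assumes the peak rate is held for the full burst duration (upper bound).
    """
    return (peak_rate - normal_rate) * burst_sec


backlog = expected_backlog(2000, 15)          # one pipeline, 15 s offline
low = observed_surplus(10000, 6000, 15)       # smallest burst we see
high = observed_surplus(12000, 6000, 20)      # largest burst we see

print(f"expected backlog: {backlog}")
print(f"observed surplus: {low} to {high}")
print(f"ratio: {low / backlog:.1f}x to {high / backlog:.1f}x")
```

Even allowing for the taper, the surplus looks like roughly two to four times the backlog we would expect.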
This raises many questions, including whether events processed through the Kinesis input plugin are really unique, or whether they get duplicated when a pipeline reloads.
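One way we are thinking of checking for duplicates is to count how often each record identifier appears in the output. The sketch below assumes every event carries some unique identifier, for example the Kinesis sequence number, and that we can export those identifiers into a list; how they are extracted from Logstash is not shown, and the sample values are made up.

```python
from collections import Counter


def find_duplicates(event_ids):
    """Return a dict of identifier -> count for identifiers seen more than once."""
    counts = Counter(event_ids)
    return {event_id: n for event_id, n in counts.items() if n > 1}


# Hypothetical identifiers collected around a pipeline reload.
sample = ["seq-1", "seq-2", "seq-2", "seq-3", "seq-3", "seq-3"]
dupes = find_duplicates(sample)
print(dupes)
```

If this came back non-empty for a window around a reload, it would point to records being re-read rather than to a genuinely larger backlog.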
Can anybody provide any insight on this?