Horizontal scaling doesn't improve Logstash performance with the Kinesis input plugin

As per our Prometheus metrics, only 1 Logstash pod is processing the Kinesis input, so in this case we have to scale the pods vertically.

Can multiple Logstash instances not work in parallel with the Kinesis input?

@deepak_deore - what is the shard count of the Kinesis data stream from which your logstash pods are consuming the messages?

there is only 1 shard

So is the calculation 1 shard == 1 Logstash instance?

Yes, that seems to be how the Kinesis Client Library (KCL), which the Logstash Kinesis input plugin uses, works. Please see this link and try resharding (increasing the shard count): https://docs.aws.amazon.com/streams/latest/dev/kinesis-record-processor-scaling.html

Please keep in mind that increasing the shard count has cost implications - https://aws.amazon.com/kinesis/data-streams/pricing/
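If you want to script the resharding, here is a minimal sketch in Python. It assumes you have boto3 installed and AWS credentials configured; `describe_stream_summary` and `update_shard_count` are boto3 Kinesis client calls, while the function name `reshard` and the stream name in the usage comment are made up for illustration.

```python
def reshard(client, stream_name, target_shards):
    """Raise the open shard count of a stream to target_shards if it is lower.

    `client` is expected to be a boto3 Kinesis client (or anything exposing
    the same two methods). Returns the shard count that was requested, or
    the current count when no change is needed.
    """
    summary = client.describe_stream_summary(StreamName=stream_name)
    current = summary["StreamDescriptionSummary"]["OpenShardCount"]
    if current >= target_shards:
        # Already at or above the target, so leave the stream alone.
        return current
    client.update_shard_count(
        StreamName=stream_name,
        TargetShardCount=target_shards,
        ScalingType="UNIFORM_SCALING",
    )
    return target_shards


# Usage (hypothetical stream name; needs boto3 and AWS credentials):
#   import boto3
#   reshard(boto3.client("kinesis"), "my-log-stream", 2)
```

The resharding itself is asynchronous, so the stream will take a little while to go back to ACTIVE after the call returns.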

thanks @Rahul_Kumar4, it worked

By the way, Logstash uses application_name to save its state in DynamoDB. Do you know whether running 2 Logstash instances against 1 Kinesis shard, each with a different application_name, would let both work independently on that single shard?

That would not work. Two different application_names would mean two separate DynamoDB tables being created to track checkpoints, each independent of the other. Unless you find a way to share state between those two tables, neither instance would know how far the other has processed the records in that single shard. You may actually end up processing the records twice.
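To make that concrete, here is a toy simulation in Python. It is not how the KCL is implemented; the `checkpoints` dict just stands in for the per-application DynamoDB table, and the names `consume`, `logstash-a` and `logstash-b` are made up for illustration.

```python
def consume(application_name, shard_records, checkpoints):
    """Return the records this application still has to read, then checkpoint."""
    # checkpoints maps application_name -> index of the next unread record.
    # It is a stand-in for the per-application DynamoDB table (assumption).
    start = checkpoints.get(application_name, 0)
    pending = shard_records[start:]
    checkpoints[application_name] = len(shard_records)
    return pending


shard_records = ["event-0", "event-1", "event-2"]
checkpoints = {}

# Two different application names: each one starts from its own empty checkpoint,
# so both read the full shard.
seen_by_a = consume("logstash-a", shard_records, checkpoints)
seen_by_b = consume("logstash-b", shard_records, checkpoints)
print(seen_by_a)
print(seen_by_b)

# Same application name again: the stored checkpoint means nothing is re-read.
print(consume("logstash-a", shard_records, checkpoints))
```

Both application names read all three events, so anything downstream receives each event twice.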

You could make do with a single Logstash instance, but if you are looking to scale, you would have to increase the shard count.

These lines from that link give more context:
Typically, when you use the KCL, you should ensure that the number of instances does not exceed the number of shards (except for failure standby purposes). Each shard is processed by exactly one KCL worker and has exactly one corresponding record processor, so you never need multiple instances to process one shard. However, one worker can process any number of shards, so it's fine if the number of shards exceeds the number of instances.
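A rough way to picture the rule quoted above, as a Python sketch. The real KCL balances shards through leases stored in DynamoDB; the round-robin here is a simplification, and the function name `assign_shards` and the pod and shard names are made up for illustration.

```python
def assign_shards(shards, workers):
    """Give every shard exactly one owner, spreading shards round-robin."""
    assignment = {worker: [] for worker in workers}
    for index, shard in enumerate(shards):
        owner = workers[index % len(workers)]
        assignment[owner].append(shard)
    return assignment


# 1 shard, 3 pods: only one pod gets work, the other two sit idle.
one_shard = assign_shards(["shard-0"], ["pod-a", "pod-b", "pod-c"])
print(one_shard)

# 4 shards, 2 pods: each pod handles two shards.
four_shards = assign_shards(
    ["shard-0", "shard-1", "shard-2", "shard-3"], ["pod-a", "pod-b"]
)
print(four_shards)
```

The first case matches the situation in this thread: with a single shard, adding more pods only adds standby capacity.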

thanks for the info, clear now
