Horizontal scaling doesn't improve Logstash performance with Kinesis input plugin

deepak_deore · March 30, 2020, 12:37pm

As per Prometheus metrics, only 1 Logstash pod is processing the Kinesis input, so in this case we have to scale the pods vertically.

Can multiple Logstash instances not work in parallel with the Kinesis input?

Rahul_Kumar4 · March 30, 2020, 7:01pm

@deepak_deore - what is the shard count of the Kinesis data stream from which your logstash pods are consuming the messages?

deepak_deore · March 30, 2020, 8:03pm

There is only 1 shard.

So is it 1 shard == 1 Logstash instance?

Rahul_Kumar4 · March 30, 2020, 8:31pm

Yes, that is how the Kinesis Client Library (KCL), which the Logstash Kinesis input plugin uses, seems to work. Please see this link and try resharding (increasing the shard count): https://docs.aws.amazon.com/streams/latest/dev/kinesis-record-processor-scaling.html

Please keep in mind that increasing the shard count has cost implications - https://aws.amazon.com/kinesis/data-streams/pricing/
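For anyone who wants to script the resharding, here is a minimal sketch using the boto3 Kinesis client and its `update_shard_count` call. The stream name `my-log-stream`, the target of 2 shards, and the helper names are placeholders for illustration, not values from this thread. Check the UpdateShardCount limits for your account before relying on it.

```python
from typing import Any, Dict


def build_reshard_request(stream_name: str, target_shards: int) -> Dict[str, Any]:
    """Build the keyword arguments for the Kinesis UpdateShardCount call."""
    if target_shards < 1:
        raise ValueError("target_shards must be at least 1")
    return {
        "StreamName": stream_name,
        "TargetShardCount": target_shards,
        # UNIFORM_SCALING creates equal-sized shards.
        "ScalingType": "UNIFORM_SCALING",
    }


def reshard(client: Any, stream_name: str, target_shards: int) -> Any:
    """Ask Kinesis to change the shard count; `client` is a boto3 Kinesis client."""
    return client.update_shard_count(
        **build_reshard_request(stream_name, target_shards)
    )


def main() -> None:
    # boto3 is imported here so the helpers above work without it installed.
    import boto3

    client = boto3.client("kinesis")
    # "my-log-stream" is a placeholder stream name (assumption).
    reshard(client, "my-log-stream", 2)
```

Calling `main()` needs AWS credentials and the `kinesis:UpdateShardCount` permission. Once the stream has 2 shards, a second Logstash pod using the same application_name should be able to pick up one of them.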

deepak_deore · March 31, 2020, 11:11am

Thanks @Rahul_Kumar4, it worked.

By the way, Logstash uses application_name to save its state in DynamoDB. If we run 2 Logstash instances against 1 Kinesis shard but set a different application_name on each, will both instances work independently on that single shard?

Rahul_Kumar4 · April 1, 2020, 11:36am

That would not work. Having two different application_names would mean two separate DynamoDB tables are created to track checkpointing, each independent of the other. Unless you find a way to share state between these two tables, the two instances would not know how far the records in that single shard have been processed. You may actually end up processing every record twice.

You could do with just a single Logstash instance, but if you are looking to scale, you would have to increase the shard count.
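To illustrate why separate checkpoint tables lead to duplicates, here is a toy simulation. It is not the KCL or the plugin itself. The names `logstash-a` and `logstash-b` and the in-memory `checkpoints` dict (standing in for one DynamoDB table per application_name) are assumptions made for the example.

```python
from typing import Dict, List

# Records sitting in one shard, in sequence order.
SHARD_RECORDS: List[str] = ["rec-0", "rec-1", "rec-2"]


def consume(application_name: str, checkpoints: Dict[str, int]) -> List[str]:
    """Read from this application's own checkpoint to the end of the shard."""
    start = checkpoints.get(application_name, 0)
    processed = SHARD_RECORDS[start:]
    # Each application only ever updates its own checkpoint entry.
    checkpoints[application_name] = len(SHARD_RECORDS)
    return processed


# One entry per application_name, standing in for one DynamoDB table each.
checkpoints: Dict[str, int] = {}
first = consume("logstash-a", checkpoints)   # reads all three records
second = consume("logstash-b", checkpoints)  # reads the same three again
repeat = consume("logstash-a", checkpoints)  # nothing new for logstash-a
```

Because `logstash-b` never sees the checkpoint written by `logstash-a`, it starts from the beginning of the shard and every record is handled twice.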

These lines from that link give more context:
"Typically, when you use the KCL, you should ensure that the number of instances does not exceed the number of shards (except for failure standby purposes). Each shard is processed by exactly one KCL worker and has exactly one corresponding record processor, so you never need multiple instances to process one shard. However, one worker can process any number of shards, so it's fine if the number of shards exceeds the number of instances."
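The rule quoted above can be sketched as a small assignment function. This is a simplification: the real KCL balances shard leases stored in DynamoDB, whereas the round-robin below only shows the "one shard, one worker" constraint. The pod and shard names are made up for the example.

```python
from typing import Dict, List


def assign_shards(shards: List[str], workers: List[str]) -> Dict[str, List[str]]:
    """Give each shard to exactly one worker, round-robin."""
    if not workers:
        raise ValueError("at least one worker is required")
    assignment: Dict[str, List[str]] = {worker: [] for worker in workers}
    for index, shard in enumerate(shards):
        assignment[workers[index % len(workers)]].append(shard)
    return assignment


def idle_workers(assignment: Dict[str, List[str]]) -> List[str]:
    """Workers that received no shard and therefore process nothing."""
    return sorted(worker for worker, owned in assignment.items() if not owned)


# One shard, three pods: only one pod gets work, as seen in the metrics.
one_shard = assign_shards(["shard-0"], ["pod-1", "pod-2", "pod-3"])
# Four shards, two pods: each pod handles two shards.
four_shards = assign_shards(
    ["shard-0", "shard-1", "shard-2", "shard-3"], ["pod-1", "pod-2"]
)
```

With one shard, `idle_workers(one_shard)` lists the two pods that sit on standby, which is why adding pods without adding shards does not raise throughput.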

deepak_deore · April 1, 2020, 11:42am

Thanks for the info, it's clear now.

