Logstash 6.2.4 Slow Performance Kinesis Input when using json codec

amarsh · May 16, 2018, 5:14pm

I'm seeing a significant performance drop when using codec => "json" with the Kinesis input plugin and wanted to see if anyone has recommendations or suggestions for speeding up the processing.

On a 4-CPU, 16 GB box running Red Hat with -Xmx and -Xms set to 8 GB, the following config processes 50,000+ events per minute with 4 workers and a batch size of 10,000 into an 8-node Elasticsearch cluster. This is without the codec setting.

input {
        kinesis {
                kinesis_stream_name => "kinesis-stream"
                application_name => "logstash-kinesis-poc"
                region => "us-east-1"
                profile => "default"
        }
}
output {
        stdout { codec => rubydebug }
        elasticsearch {
                hosts => ["http://elastic_box_one:9200","http://elastic_box_two:9200"]
                doc_as_upsert => true
                template_overwrite => false
                index => "kinesis-poc-%{+YYYY-MM-dd}"
                template_name => "poc"
        }
}
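
For reference, the worker and batch settings used in this test would correspond to entries like the following in logstash.yml. This is a sketch of the equivalent settings based on the numbers above, not a copy of the actual file; the same values can also be passed on the command line with -w and -b, and the 8 GB heap would be set separately as -Xms8g and -Xmx8g in jvm.options.

```yaml
# logstash.yml (sketch; values taken from the test described above)
pipeline.workers: 4          # "4 workers"
pipeline.batch.size: 10000   # "batch size of 10,000"
```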

When type and codec are added to the input, throughput drops to 2,100+ events per minute with the same 4 workers and batch size of 10,000.

input {
        kinesis {
                kinesis_stream_name => "kinesis-stream"
                application_name => "logstash-kinesis-poc"
                region => "us-east-1"
                profile => "default"
                type => "message"
                codec => "json"
        }
}
output {
        stdout { codec => rubydebug }
        elasticsearch {
                hosts => ["http://elastic_box_one:9200","http://elastic_box_two:9200"]
                doc_as_upsert => true
                template_overwrite => false
                index => "kinesis-poc-%{+YYYY-MM-dd}"
                template_name => "poc"
        }
}

Throughput rises to about 5,000 events per minute when the workers are left at 4 and the batch size is reduced to 100.

We also tried using a json filter with source set to the message field, but that produced the same level of performance decrease as using codec => "json" in the input.

filter {
        json {
                source => "message"
        }
}

system · June 13, 2018, 5:14pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
