I'm analysing the way a logging stack has been configured and am facing a particular issue.
We have two Logstash instances, each running the same pipeline that consumes from Kafka. See the input section of the pipeline below:
input {
  kafka {
    bootstrap_servers => "broker1:9092,broker2:9092,broker3:9092"
    topics => ["topic-name"]
    security_protocol => "SSL"
    ssl_truststore_location => "/path/to/truststore/truststore.jks"
    ssl_truststore_password => "XYZ"
    client_id => "client-id"
    consumer_threads => 10
  }
}
The above pipeline configuration is the same for both Logstash instances.
Regarding Kafka - this particular topic has a total of 8 partitions.
Issue #1 - According to this thread, we have a total of 20 consumer threads across both Logstash instances, but since only 8 partitions exist for this consumer group, 12 of those consumer threads are idle at all times across the two Logstash servers. For this scenario, the correct configuration would be 4 consumer_threads on each Logstash instance. Is this conclusion correct?
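For what it's worth, here is the arithmetic behind that conclusion as a quick sanity check. The min() rule reflects my understanding that Kafka assigns each partition to at most one consumer in a group, so surplus threads sit idle:

```python
# Sanity check for Issue #1: Kafka assigns each partition to at most one
# consumer in a consumer group, so threads beyond the partition count idle.
partitions = 8
instances = 2
consumer_threads_per_instance = 10  # current pipeline setting

total_threads = instances * consumer_threads_per_instance  # 20
active_threads = min(total_threads, partitions)            # 8
idle_threads = total_threads - active_threads              # 12

# A balanced configuration would spread the partitions evenly:
balanced_threads_per_instance = partitions // instances    # 4

print(active_threads, idle_threads, balanced_threads_per_instance)  # 8 12 4
```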
Issue #2 - Checking the number of workers on Logstash for this particular pipeline, I can see the following output when running curl -XGET -k 'http://127.0.0.1:9600/_node/pipelines?pretty':
"pipeline-name" : {
  "ephemeral_id" : "553bc152-e29f-4c45-88f7-f5b77ff5a5b2",
  "hash" : "751c0c4bd2060bb70793328544279246418c135f3354596212ebb40247b3d5ff",
  "workers" : 16,
  "batch_size" : 125,
  "batch_delay" : 50,
  "config_reload_automatic" : false,
  "config_reload_interval" : 3000000000,
  "dead_letter_queue_enabled" : false
}
Considering we have 16 workers on each Logstash instance, does that directly correlate with the consumer_threads set in the pipeline .conf, meaning each instance would have 16 workers * 10 consumer threads = 160 consumer threads?
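In case it helps frame the question: my current (possibly wrong) understanding of the Logstash docs is that pipeline workers drive the filter/output stages while consumer_threads only controls the Kafka input plugin's threads, so the two settings would not multiply. Under that assumption the two readings compare as follows - please correct me if the multiplication reading is the right one:

```python
# Issue #2 arithmetic, assuming pipeline workers and consumer_threads are
# independent settings (workers run filter/output batches; consumer_threads
# are the Kafka input's consumer threads).
workers = 16           # from the /_node/pipelines API output
consumer_threads = 10  # from the pipeline config
instances = 2

# Reading A (multiplication) - what the question asks about:
threads_if_multiplied = workers * consumer_threads           # 160 per instance

# Reading B (independent settings) - my current understanding:
threads_if_independent = consumer_threads                    # 10 per instance
total_consumer_threads = instances * threads_if_independent  # 20 overall

print(threads_if_multiplied, threads_if_independent, total_consumer_threads)
```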
For context: the end goal here is to reduce the consumer lag, which is currently extremely high and is leading to a 3-5 hour delay before logs show up in OpenSearch, where Logstash currently sends them. Our idea was to increase the number of partitions for the Kafka topics with a higher number of messages so that we can also directly increase the Logstash consumer_threads consuming those topics (again, according to the thread shared above, these consumer threads have a 1:1 ratio with Kafka partitions) and hopefully decrease the lag.
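If the 1:1 thread-to-partition reasoning holds, the repartitioning step would presumably look something like the following (broker address, target partition count, and the client-ssl.properties filename are hypothetical; with SSL on the brokers a --command-config file pointing at the truststore would be needed):

```shell
# Raise the partition count to match the total consumer threads
# (2 instances * 10 consumer_threads = 20). Note: Kafka partition counts
# can only be increased, never decreased, and adding partitions changes
# the key-to-partition mapping for keyed messages.
kafka-topics.sh --bootstrap-server broker1:9092 \
  --command-config client-ssl.properties \
  --alter --topic topic-name --partitions 20
```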
Thank you