Increase performance of Logstash with multiple inputs

Zeeshan_Alam · July 20, 2017, 10:25am

I am using Logstash to read from Kafka. My VM has 6 processors.

I looked at the following two settings:

pipeline.workers: Default is Number of the host’s CPU cores

The number of workers that will, in parallel, execute the filter and output stages of the pipeline. If you find that events are backing up, or that the CPU is not saturated, consider increasing this number to better utilize machine processing power.

pipeline.output.workers: Default is 1

The number of workers to use per output plugin instance.

Since each kafka input is processed in a single thread, should I split it into multiple kafka inputs and set pipeline.output.workers: 6 to increase parallelism?

Is this a good approach to maximize the usage of my VM?

input {
  kafka {
    bootstrap_servers => "kfk1:9092,kfk2:9092"
    topics => ["MyTopic"]
    group_id => "kafka-test101"
  }
  kafka {
    bootstrap_servers => "kfk1:9092,kfk2:9092"
    topics => ["MyTopic"]
    group_id => "kafka-test101"
  }
  kafka {
    bootstrap_servers => "kfk1:9092,kfk2:9092"
    topics => ["MyTopic"]
    group_id => "kafka-test101"
  }
  kafka {
    bootstrap_servers => "kfk1:9092,kfk2:9092"
    topics => ["MyTopic"]
    group_id => "kafka-test101"
  }
  kafka {
    bootstrap_servers => "kfk1:9092,kfk2:9092"
    topics => ["MyTopic"]
    group_id => "kafka-test101"
  }
  kafka {
    bootstrap_servers => "kfk1:9092,kfk2:9092"
    topics => ["MyTopic"]
    group_id => "kafka-test101"
  }
}
output {
  elasticsearch {
    # Each host is its own array element, not one comma-joined string
    hosts => ["host1", "host2", "host3"]
    index => "logstash-myindex-%{+YYYY.MM.dd}-1"
  }
}

Christian_Dahlqvist · July 20, 2017, 12:00pm

How did you establish that the Kafka input is the limiting factor in your pipeline? Have you looked at tweaking the config of the input plugin, e.g. through the number of consumer threads?
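
For reference, the consumer-threads option mentioned above could look roughly like the sketch below. It reuses the broker, topic, and group names from the question and sets the kafka input's `consumer_threads` option on a single input instead of repeating the block six times. The value 6 is an assumption taken from the VM's core count; it also assumes MyTopic has at least 6 partitions, which the thread does not state. Threads beyond the partition count would have nothing to consume.

```
input {
  kafka {
    bootstrap_servers => "kfk1:9092,kfk2:9092"
    topics => ["MyTopic"]
    group_id => "kafka-test101"
    # Assumption: MyTopic has at least 6 partitions.
    # Each consumer thread joins the same consumer group,
    # so partitions are divided among the threads.
    consumer_threads => 6
  }
}
```

Whether this helps at all depends on the input actually being the bottleneck, which is the point of the question above; if the filter or output stages are the slow part, changing the input will not raise throughput.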

system · August 17, 2017, 12:01pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
