Logstash performance bottleneck

I'm using the Kafka input with the JSON codec. Parsing (dissect) is already done on the Filebeat side.

What I'm seeing is a thread called "Ruby-0-Thread-3" using 100% CPU. Overall CPU utilization isn't high; it's mostly that one thread pegged at 100%. I have 8 pipeline workers specified.

top - 14:52:18 up 47 days, 19:05,  0 users,  load average: 3.01, 3.24, 2.97
Threads: 117 total,   2 running, 115 sleeping,   0 stopped,   0 zombie
%Cpu(s): 13.5 us,  1.6 sy,  0.0 ni, 84.8 id,  0.0 wa,  0.0 hi,  0.2 si,  0.0 st
KiB Mem : 19787100+total, 14533780+free, 42042980 used, 10490216 buff/cache
KiB Swap:  4095996 total,  4095996 free,        0 used. 15306265+avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                                                                                                                                                     
  130 logstash  20   0   13.6g   4.0g  23056 R 98.3  2.1  20:50.68 Ruby-0-Thread-3                                                                                                                                             
  133 logstash  20   0   13.6g   4.0g  23056 S 18.0  2.1   3:25.85 Ruby-0-Thread-3                                                                                                                                             
  115 logstash  20   0   13.6g   4.0g  23056 S 15.0  2.1   2:44.57 [cdn-edge-logp]                                                                                                                                             
  122 logstash  20   0   13.6g   4.0g  23056 S 14.7  2.1   2:43.01 [cdn-edge-logp]                                                                                                                                             
  117 logstash  20   0   13.6g   4.0g  23056 S 13.3  2.1   2:43.72 [cdn-edge-logp]                                                                                                                                             
  110 logstash  20   0   13.6g   4.0g  23056 S 13.0  2.1   2:42.45 [cdn-edge-logp]                                                                                                                                             
  113 logstash  20   0   13.6g   4.0g  23056 S 12.7  2.1   2:44.44 [cdn-edge-logp]                                                                                                                                             
  119 logstash  20   0   13.6g   4.0g  23056 S 12.7  2.1   2:43.33 [cdn-edge-logp]                                                                                                                                             
  108 logstash  20   0   13.6g   4.0g  23056 S 11.7  2.1   2:43.56 [cdn-edge-logp]                                                                                                                                             
  121 logstash  20   0   13.6g   4.0g  23056 S 11.3  2.1   2:42.97 [cdn-edge-logp]   

Somehow I can only process 18k messages per second, and it won't go any higher.

The pipeline viewer doesn't look accurate either...

18k events per second:

[screenshot: pipeline viewer]

How can I troubleshoot what is using all that CPU on that thread?
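One way to see what a busy thread is doing is Logstash's hot-threads monitoring API, and then to correlate `top -H` output with a JVM stack dump. A sketch, assuming Logstash's monitoring API is on its default port 9600 and using the TID 130 from the top output above (the grep'd nid value follows from that conversion):

```shell
# Logstash exposes a hot-threads API on its monitoring port (9600 by default);
# it reports the busiest threads with their Java stack traces
curl -s 'http://localhost:9600/_node/hot_threads?human=true&threads=3'

# To correlate with top: top -H shows native thread ids in the PID column,
# while jstack reports them as hexadecimal "nid" values. Convert the hot
# thread's TID (130 in the top output above) to hex:
printf 'nid=0x%x\n' 130

# Then dump the JVM stacks and locate that thread (assumes LOGSTASH_PID
# holds the Logstash process id):
# jstack "$LOGSTASH_PID" | grep -A 20 'nid=0x82'
```

The hot-threads output usually makes it obvious whether the hot thread is an input, a filter worker, or something internal like the metrics/monitoring thread.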

I'm not sure, but in my experience, if you use a network input, the bottleneck is the number of network-session threads: one TCP session, one thread, one CPU core.

How many partitions does the topic have?
What value did you set for "consumer_threads"?

Set the number of partitions equal to the number of cores on the Logstash server.
Set "consumer_threads" equal to the number of partitions.
Then all processor cores will be loaded.
It works for me.
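A minimal sketch of that setup in the Kafka input, assuming an 8-core Logstash server and a hypothetical topic "cdn-edge-logs" with 8 partitions (the broker address and topic name are placeholders):

```
input {
  kafka {
    bootstrap_servers => "kafka01:9092"     # placeholder broker address
    topics            => ["cdn-edge-logs"]  # hypothetical topic with 8 partitions
    codec             => "json"
    consumer_threads  => 8                  # one consumer thread per partition
  }
}
```

With more threads than partitions, the extra threads sit idle; with fewer, some threads have to consume several partitions each, so the two numbers are usually kept equal.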

https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kafka.html#plugins-inputs-kafka-consumer_threads


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.