Hey,
Configuration
Elasticsearch cluster (3 master nodes, 2 coordinating nodes, 6 data nodes) + 2 Logstash (5.2.1) nodes
Cluster status is green:
Nodes: 11
Indices: 23
Memory: 57GB / 249GB
Total Shards: 150
Unassigned Shards: 0
Documents: 3,607,973,417
Data: 5TB
Uptime: 3 days
Version: 5.2.1
Logstash filter:
Logstash itself runs with the default settings; the Bluecoat logs are parsed with the csv filter (which is where the errors below come from).
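A minimal sketch of the kind of filter in use; the column names here are illustrative, not our exact mapping:

filter {
  # Bluecoat access logs are space-delimited (W3C ELFF)
  csv {
    separator => " "
    columns => ["date", "time", "time_taken", "c_ip", "cs_method", "cs_uri"]
  }
}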
Problem
Each of the Logstash nodes starts without any problems and imports with an index rate of ~20k events/sec. There are some errors in the Logstash logs, such as:
Error parsing csv [ . . . ] :exception=>#<CSV::MalformedCSVError: Illegal quoting in line 1.>}
Received an event that has a different character encoding than you configured. [ . . . ] :expected_charset=>"UTF-8"}
We are aware of them; these are known problems with Bluecoat log files. 60k bad events out of 3.6 billion is acceptable.
At some point the index rate drops to ~300 events/sec and the CPU usage grows to 100% on all CPU cores. This happens on both Logstash nodes, but independently of each other, e.g. on Logstash node 1 it occurs after 3 hours and on Logstash node 2 after 7 hours.
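When the rate drops, the hot threads API (part of the monitoring API, which listens on port 9600 by default in 5.x) should show where the worker threads spend their time, e.g.:

# curl -XGET 'localhost:9600/_node/hot_threads?human=true&threads=3'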
# ps -ef | grep java
root 21948 21041 99 Feb19 pts/2 12-07:39:39 /usr/bin/java -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+DisableExplicitGC -Djava.awt.headless=true -Dfile.encoding=UTF-8 -XX:+HeapDumpOnOutOfMemoryError -Xmx1g -Xms256m -Xss2048k -Djffi.boot.library.path=/usr/share/logstash/vendor/jruby/lib/jni -Xbootclasspath/a:/usr/share/logstash/vendor/jruby/lib/jruby.jar -classpath : -Djruby.home=/usr/share/logstash/vendor/jruby -Djruby.lib=/usr/share/logstash/vendor/jruby/lib -Djruby.script=jruby -Djruby.shell=/bin/sh org.jruby.Main /usr/share/logstash/lib/bootstrap/environment.rb logstash/runner.rb --path.settings=/etc/logstash/
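Since the JVM runs with -Xmx1g, one suspicion is that the process ends up thrashing in GC; jstat can show whether the collector runs constantly while the node is slow (21948 is the Logstash PID from above, sampled every second, 10 samples):

# jstat -gcutil 21948 1000 10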
# top -Hp 21948 | head -n23
top - 11:11:25 up 27 days, 21:10, 1 user, load average: 15.93, 15.56, 14.10
Threads: 84 total, 5 running, 79 sleeping, 0 stopped, 0 zombie
%Cpu(s): 7.2 us, 0.1 sy, 0.0 ni, 92.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 65864420 total, 2103748 free, 1371972 used, 62388700 buff/cache
KiB Swap: 15615996 total, 15615996 free, 0 used. 63891584 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
22024 root 20 0 8244812 1.089g 18872 S 99.9 1.7 1039:01 [main]>worker11
22026 root 20 0 8244812 1.089g 18872 S 99.9 1.7 1038:29 [main]>worker13
22028 root 20 0 8244812 1.089g 18872 S 99.9 1.7 1039:25 [main]>worker15
22013 root 20 0 8244812 1.089g 18872 S 93.8 1.7 1038:54 [main]>worker0
22014 root 20 0 8244812 1.089g 18872 R 93.8 1.7 1038:46 [main]>worker1
22015 root 20 0 8244812 1.089g 18872 S 93.8 1.7 1039:30 [main]>worker2
22016 root 20 0 8244812 1.089g 18872 R 93.8 1.7 1039:05 [main]>worker3
22017 root 20 0 8244812 1.089g 18872 R 93.8 1.7 1039:03 [main]>worker4
22018 root 20 0 8244812 1.089g 18872 S 93.8 1.7 1038:11 [main]>worker5
22019 root 20 0 8244812 1.089g 18872 S 93.8 1.7 1038:59 [main]>worker6
22020 root 20 0 8244812 1.089g 18872 R 93.8 1.7 1038:29 [main]>worker7
22021 root 20 0 8244812 1.089g 18872 S 93.8 1.7 1038:54 [main]>worker8
22022 root 20 0 8244812 1.089g 18872 S 93.8 1.7 1039:03 [main]>worker9
22023 root 20 0 8244812 1.089g 18872 S 93.8 1.7 1038:58 [main]>worker10
22025 root 20 0 8244812 1.089g 18872 S 93.8 1.7 1038:42 [main]>worker12
22027 root 20 0 8244812 1.089g 18872 S 93.8 1.7 1039:16 [main]>worker14
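To see what a single hot worker is actually doing, the thread id from top can be converted to hex and looked up in a jstack dump of the Logstash JVM, e.g. for worker11 (thread id 22024):

# printf '0x%x\n' 22024
0x5608
# jstack 21948 | grep -A20 'nid=0x5608'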
Any ideas how I might find out whether there is a problem within the filter, or any other ideas? There are no errors/warnings in the Elasticsearch or Logstash logs at the point the problem occurs.
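Would comparing the per-plugin timings from the node stats API be a sensible way to check the filter? If I read the docs right, each filter reports events in/out and duration_in_millis there:

# curl -XGET 'localhost:9600/_node/stats/pipeline?pretty'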
Thanks
Andreas