Why is my Logstash consuming more than 80% of CPU?

Hi,

I use ELK GA 5.0.0. I have 2 Logstash instances on my Linux box, both of which consume from Kafka. One Logstash instance consumes from a single Kafka topic while the other consumes from 3 other Kafka topics. I designed the instances this way because the first topic has more data than the other 3 topics. Using the top command in Linux, I noticed that Logstash is taking a lot of CPU. I then tried Logstash's hot threads API.
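I called the API with something like the request below (9600 being the HTTP API port from my setup; adjust host and port to yours):

curl -XGET 'http://192.168.0.1:9600/_node/hot_threads?threads=3'

The output looked like this: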

{
  "host" : "my-linux-box",
  "version" : "5.0.0",
  "http_address" : "192.168.0.1:9600",
  "hot_threads" : {
    "time" : "2017-03-07T10:22:03+00:00",
    "busiest_threads" : 3,
    "threads" : [ {
      "name" : "Ruby-0-Thread-11",
      "percent_of_cpu_time" : 80.56,
      "state" : "waiting",
      "path" : "/apps/ELK/Logstash/vendor/bundle/jruby/1.9/gems/logstash-input-kafka-5.0.5/lib/logstash/inputs/kafka.rb:191",
      "traces" : [ "sun.misc.Unsafe.park(Native Method)", "java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)", "java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)", "java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)", "java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)", "java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)", "java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)", "org.jruby.javasupport.util.ObjectProxyCache$Segment.getOrCreate(ObjectProxyCache.java:449)", "org.jruby.javasupport.util.ObjectProxyCache.getOrCreate(ObjectProxyCache.java:164)", "org.jruby.javasupport.Java.getInstance(Java.java:385)" ]
    }, {
      "name" : "Ruby-0-Thread-12",
      "percent_of_cpu_time" : 80.27,
      "state" : "runnable",
      "path" : "/apps/ELK/Logstash/vendor/bundle/jruby/1.9/gems/logstash-input-kafka-5.0.5/lib/logstash/inputs/kafka.rb:191",
      "traces" : [ "java.lang.System.identityHashCode(Native Method)", "org.jruby.javasupport.util.ObjectProxyCache.hash(ObjectProxyCache.java:174)", "org.jruby.javasupport.util.ObjectProxyCache.getOrCreate(ObjectProxyCache.java:163)", "org.jruby.javasupport.Java.getInstance(Java.java:385)", "org.jruby.javasupport.Java.getInstance(Java.java:364)", "org.jruby.javasupport.JavaUtil.convertJavaToUsableRubyObject(JavaUtil.java:194)", "org.jruby.javasupport.JavaMethod.convertReturn(JavaMethod.java:513)", "org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(JavaMethod.java:437)", "org.jruby.javasupport.JavaMethod.invokeDirect(JavaMethod.java:300)", "org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:35)" ]
    }, {
      "name" : "Ruby-0-Thread-13",
      "percent_of_cpu_time" : 80.21,
      "state" : "blocked",
      "path" : "/apps/ELK/Logstash/vendor/bundle/jruby/1.9/gems/logstash-input-kafka-5.0.5/lib/logstash/inputs/kafka.rb:191",
      "traces" : [ "java.lang.ref.ReferenceQueue.poll(ReferenceQueue.java:107)", "org.jruby.javasupport.util.ObjectProxyCache$Segment.expunge(ObjectProxyCache.java:320)", "org.jruby.javasupport.util.ObjectProxyCache$Segment.getOrCreate(ObjectProxyCache.java:451)", "org.jruby.javasupport.util.ObjectProxyCache.getOrCreate(ObjectProxyCache.java:164)", "org.jruby.javasupport.Java.getInstance(Java.java:385)", "org.jruby.javasupport.Java.getInstance(Java.java:364)", "org.jruby.javasupport.JavaUtil.convertJavaToUsableRubyObject(JavaUtil.java:194)", "org.jruby.javasupport.JavaMethod.convertReturn(JavaMethod.java:513)", "org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(JavaMethod.java:437)", "org.jruby.javasupport.JavaMethod.invokeDirect(JavaMethod.java:300)" ]
    } ]
  }
}

It would be really helpful if somebody could tell me why this is happening and how to fix it.

Thank you.

How many events are being processed every second?
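If you don't measure this already, the node stats API should give you a rough number; something like the request below, sampled twice over a known interval, lets you diff the events counters:

curl -XGET 'http://localhost:9600/_node/stats/pipeline'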

Hi @magnusbaeck, below is my Kafka input configuration:

kafka {
	group_id => "consumergroup_a"
	topics => ["topic_a"]
	bootstrap_servers => "192.168.0.1:9091,192.168.0.3:9093,192.168.0.2:9092"
	add_field => { "logtype" => "LOGFILE" }
}

Okay, but that doesn't answer my question.

@magnusbaeck sorry, I don't know what you are expecting. What do you mean by events? Is that the number of messages in the Kafka topic every second?

There will be an average of 50 entries every second.

Below is the entire configuration of the Logstash instance that consumes from the 3 Kafka topics and has the higher CPU utilization:

input {
	kafka {
		group_id => "consumergroup_a"
		topics => ["topic_a"]
		bootstrap_servers => "192.168.0.1:9091,192.168.0.3:9093,192.168.0.2:9092"
		codec => json
		add_field => { "logtype" => "LOGFILE" }
	}
	kafka {
		group_id => "consumergroup_b"
		topics => ["topic_b"]
		bootstrap_servers => "192.168.0.1:9091,192.168.0.3:9093,192.168.0.2:9092"
		add_field => { "logtype" => "LOGFILE" }
	}
	kafka {
		group_id => "consumergroup_c"
		topics => ["topic_c"]
		bootstrap_servers => "192.168.0.1:9091,192.168.0.3:9093,192.168.0.2:9092"
		add_field => { "logtype" => "logfile_c" }
	}
}

filter {
	if [logtype] == "LOGFILE" {
		if [fields][logtype] == "logfile_a" {
			grok {
				match => { "message" => "####<(?<timestamp>%{MONTH} %{MONTHDAY}, 20%{YEAR} %{HOUR}:?%{MINUTE}(?::?%{SECOND}) (?:AM|PM) (?:GMT))\> <%{GREEDYDATA:field_a}> <%{GREEDYDATA:field_b}> <%{GREEDYDATA:skip1}> <%{GREEDYDATA:field_c}> <%{GREEDYDATA:skip2}> <%{GREEDYDATA:skip3}> <%{GREEDYDATA:skip4}> <%{GREEDYDATA:field_d}> <%{GREEDYDATA:skip5}> <%{GREEDYDATA:skip6}> <%{GREEDYDATA:field_e}>" }	
			}
			mutate {
				remove_field => ["skip1","skip2","skip3","skip4","skip5","skip6"]
			}
			date {
				match => [ "timestamp", "MMM dd, YYYY hh:mm:ss aa z" ]
				timezone => "GMT"
				target => "@timestamp"
			}
		} else if [fields][logtype] == "logfile_b" {
			grok {
				match => { "message" => "%{NOTSPACE:field_a} %{NOTSPACE:field_b} ('%{NOTSPACE:field_c}'|%{NOTSPACE:field_c}) %{NOTSPACE:field_d} %{NOTSPACE:field_e} %{NOTSPACE:field_f} %{NOTSPACE:field_g} %{NOTSPACE:field_h} %{NOTSPACE:field_i} %{NOTSPACE:field_j} ('%{NOTSPACE:field_k}'|%{NOTSPACE:field_k}) %{NOTSPACE:field_l} ('%{GREEDYDATA:field_m}'|%{NOTSPACE:field_m}) ('%{GREEDYDATA:field_n}'|%{NOTSPACE:field_n}) ('%{GREEDYDATA:field_o}'|%{NOTSPACE:field_o}) ('%{GREEDYDATA:field_p}'|%{NOTSPACE:field_p})" }
			}
			if [field_d] == "-" {
				mutate {
					replace => [ "field_d", "-1" ]
				}
			}
			if [field_f] == "-" {
				mutate {
					replace => [ "field_f", "-1" ]
				}
			}
			mutate {
				add_field => {
					"logtime" => "%{field_a} %{field_b}"
				}
				remove_field => [ "field_a", "field_b" ]
				gsub => ["field_g","\'",""]
			}
			date {
				match => [ "logtime", "YYYY-MM-dd HH:mm:ss" ]
				timezone => "GMT"
				target => "@timestamp"
			}
		}
	} else if [logtype] == "logfile_c" {
		grok {
			match => { "message" => "^##(?<level>(levelA|levelB|levelC|levelD|levelE)) (?<timestamp>%{MONTHDAY}-%{MONTH}-%{YEAR} %{TIME}) %{NOTSPACE:field_a} %{NOTSPACE:field_b} %{GREEDYDATA:field_c}" }
		}
		date {
			match => [ "timestamp", "dd-MMM-yyyy HH:mm:ss.SSS" ]
			timezone => "UTC"
			target => "@timestamp"
		}
	}
}

output {
	if [logtype] == "LOGFILE" {
		if [fields][logtype] == "logfile_a" {
			elasticsearch {
				hosts => ["192.168.0.1:9200","192.168.0.2:9200","192.168.0.3:9200"]
				index => "indexa-%{+YYYY-MM-dd}"
				user => "elastic"
				password => "elasticpassword"
			}
		} else if [fields][logtype] == "logfile_b" {
			elasticsearch {
				hosts => ["192.168.0.1:9200","192.168.0.2:9200","192.168.0.3:9200"]
				index => "indexb-%{+YYYY-MM-dd}"
				user => "elastic"
				password => "elasticpassword"
			}
		}
	} else if [logtype] == "logfile_c" {
		elasticsearch {
			hosts => ["192.168.0.1:9200","192.168.0.2:9200","192.168.0.3:9200"]
			index => "indexc-%{+YYYY-MM-dd}"
			user => "elastic"
			password => "elasticpassword"
		}
	}
}

50 events/second shouldn't cause any CPU spikes on any reasonably recent machine. Not sure what's up here.

Your first grok pattern has 11 occurrences of GREEDYDATA, which will be very inefficient. I saw a grok pattern later in the config that had a similar structure but instead used NOTSPACE. Would it be possible to replace GREEDYDATA with NOTSPACE in the first grok pattern as well?
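For example, the logfile_a grok could be rewritten along these lines (just a sketch, assuming none of the bracketed values contain spaces):

grok {
	match => { "message" => "####<(?<timestamp>%{MONTH} %{MONTHDAY}, 20%{YEAR} %{HOUR}:?%{MINUTE}(?::?%{SECOND}) (?:AM|PM) (?:GMT))\> <%{NOTSPACE:field_a}> <%{NOTSPACE:field_b}> <%{NOTSPACE:skip1}> <%{NOTSPACE:field_c}> <%{NOTSPACE:skip2}> <%{NOTSPACE:skip3}> <%{NOTSPACE:skip4}> <%{NOTSPACE:field_d}> <%{NOTSPACE:skip5}> <%{NOTSPACE:skip6}> <%{NOTSPACE:field_e}>" }
}

If any of those values can contain spaces, a custom pattern like (?<field_a>[^>]*) is still much cheaper than GREEDYDATA, since it can't run past the closing bracket and force backtracking.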

You can see my entire configuration above; if you need any other traces or settings, I can provide them. Thanks for your help.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.