Ruby filter - using require caused CPU spike

blacklobo · May 18, 2018, 8:41pm

I am curious if others have had this problem and/or if someone understand WHY this occurs. I had the below bit of code in a Logstash instance that was consuming from one Kafka cluster performing some filtering and then output to another Kafka cluster. Under BAU operations this was working fine. However if the instance was restarted with from earliest a condition would occur that caused the CPU usage to spike and consume nearly all CPU cycles on the server. We figured out that the "required" command is NOT actually needed and removed it from the configuration to resolve the problem. However we were hoping someone shed some light on why this was occurring.

ruby {
code => "
require 'date';

  	# Convert event time to epoch
  	event.set('[@metadata][current_time]', Time.now.to_datetime.strftime('%Q'));
  "

}

Below is what we observed while this was occurring.

All the worker threads were constantly running
The worker threads being running blocked the Kafka consumer threads from being able to consume data quickly
We ran a number of tests trying different combinations
- Removing other parts of the filter and the only thing that made any difference was the "require 'date'"
- Changing plugin options for batch size and delay which made no meaningful change
- Changing Kafka consumer options poll timeout, max poll records, fetch wait ms, etc but they made no meaningful change either

Below is part of a thread dump, showing the first few lines of a worker while this problem was occurring. In it you can see "jnr.posix.JavaSecuredFile.exists" near the top. When we performed a CPU sample in Java VisualVM that was taking up about 70-80% of the CPU.

"Ruby-0-Thread-17@[main]>worker8: :1" - Thread t@56
java.lang.Thread.State: RUNNABLE
at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:242)
at java.io.File.exists(File.java:819)
at jnr.posix.JavaSecuredFile.exists(JavaSecuredFile.java:97)
at org.jruby.util.RegularFileResource.exists(RegularFileResource.java:72)
at org.jruby.runtime.load.LibrarySearcher.findFileResourceWithLoadPath(LibrarySearcher.java:169)
at org.jruby.runtime.load.LibrarySearcher.findResourceLibrary(LibrarySearcher.java:129)
at org.jruby.runtime.load.LibrarySearcher.findLibrary(LibrarySearcher.java:65)
at org.jruby.runtime.load.LibrarySearcher.findBySearchState(LibrarySearcher.java:54)
at org.jruby.runtime.load.LoadService.findLibraryBySearchState(LoadService.java:973)
at org.jruby.runtime.load.LoadService.findFileForLoad(LoadService.java:396)
at org.jruby.runtime.load.LoadService.smartLoadInternal(LoadService.java:495)
at org.jruby.runtime.load.LoadService.require(LoadService.java:402)
at org.jruby.RubyKernel.requireCommon(RubyKernel.java:962)
at org.jruby.RubyKernel.require19(RubyKernel.java:955)

Environment:

VM with 32 cores & 48GB RAM
Logstash 6.2.2
Input Kafka 8.0.4
Output Kafka 6.2.2
Filter Ruby 3.1.3

magnusbaeck · May 22, 2018, 10:44am

Move require 'date' from the ruby filter's code option to the init option.

system · June 19, 2018, 10:44am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.