Logstash consuming a lot of CPU

Saket_Kumar · December 25, 2017, 7:52am

I am using Logstash 2.4 to process multiple files. To process all of them at the same time, I launch multiple Logstash instances using "logstash -f file.conf < file1" on a Windows Server. I don't need to tail the files; each one only has to be processed once.

I tried tweaking LS_HEAP and the number of workers, but saw no improvement.

Elasticsearch is also running on the same server. Due to the 99% CPU utilization, other processes are hampered.

I am looking for ways to reduce the CPU utilization of Logstash.

The expected load is high: there can be 100 files to process at the same time, and at this level of CPU utilization it would be difficult to manage.

Please suggest a better option, if any.

warkolm · December 25, 2017, 8:03am

Why are you running such an old version? The latest is 6.1.1, you should really upgrade.

What JVM are you using? What does your config file look like?

Saket_Kumar · December 25, 2017, 8:44am

I am using JVM 1.8, and the config file looks as follows:

input {
  stdin { }
}
filter {
  if ([message] =~ "responseCode") {
    drop { }
  } else {
    csv {
      separator => ","
      columns => ["timeStamp", "CompTime", "label", "Code", "Response", "threadName", "dataType", "success", "failureMessage", "bytes", "VUsers", "Vuser_all", "URL", "TTFB", "Encoding", "SampleCount", "ErrorCount", "Hostname", "ThinkTime", "ConnectionTime"]
    }
  }
  grok
  mutate {
    split => { "filename" => "_" }
    add_field => { "Project" => "%{[filename][0]}" }
    add_field => { "RunID" => "%{[filename][1]}" }
  }
  date {
    locale => "en"
    match => ["timeStamp", "yyyy/MM/dd HH:mm:ss.SSS", "UNIX_MS"]
    target => "timeStamp"
    timezone => "Asia/Kolkata"
  }
  mutate {
    split => { "label" => "-" }
    add_field => { "Scenario" => "%{[label][0]}" }
    add_field => { "Transaction" => "%{[label][1]}" }
    add_field => { "Request" => "%{[label][2]}" }
  }
  if [Transaction] == "%{[label][1]}" {
    mutate { replace => { "Transaction" => "NULL" } }
  }
  if [Request] == "%{[label][2]}" {
    mutate { replace => { "Request" => "NULL" } }
  }

  if [success] == "true" or [success] == "TRUE" {
    mutate { add_field => { "PassCount" => "1" } }
    mutate { add_field => { "FailCount" => "0" } }
  }
  if [success] == "false" or [success] == "FALSE" {
    mutate { add_field => { "PassCount" => "0" } }
    mutate { add_field => { "FailCount" => "1" } }
  }
  # ServeTime = completion time minus time to first byte
  ruby {
    code => "
      event['ServeTime'] = event['CompTime'].to_i - event['TTFB'].to_i
    "
  }
  # RT = time elapsed since the first event seen by this instance.
  # The first timestamp is remembered in ENV['envtime'].
  # Local variables replace the StartT/EndT constants, which Ruby warns
  # about when they are reassigned on every event.
  ruby {
    code => "
      end_t = event['timeStamp'].to_i
      vartime = ENV['envtime']
      if vartime.nil?
        start_t = end_t
        ENV['envtime'] = start_t.to_s
      else
        start_t = vartime.to_i
      end
      diff = end_t - start_t
      event['RT'] = Time.at(diff.abs).utc.strftime('%H:%M:%S')
    "
  }
  date {
    locale => "en"
    match => ["RT", "HH:mm:ss"]
    target => "RelativeTime"
    timezone => "Asia/Kolkata"
  }
  mutate {
    convert => {
      "CompTime" => "integer"
      "ServeTime" => "integer"
      "Code" => "string"
      "bytes" => "integer"
      "VUsers" => "integer"
      "Vuser_all" => "integer"
      "TTFB" => "integer"
      "SampleCount" => "integer"
      "ErrorCount" => "integer"
      "ThinkTime" => "integer"
      "PassCount" => "integer"
      "FailCount" => "integer"
      "ConnectionTime" => "integer"
    }
  }
  mutate { lowercase => ["Project"] }

}
output {
  elasticsearch {
    action => "index"
    hosts => "localhost:9200"
    index => "logstash-%{Project}-%{+YYYY.MM.dd}"
  }
  stdout {}
}
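
For reference, the relative-time (RT) calculation done by the second ruby filter above can be sketched as plain Ruby that runs outside Logstash. This is only an illustration under stated assumptions: the relative_time method and the state hash are hypothetical names standing in for the Logstash event and for the ENV['envtime'] variable used in the config, and the timestamps are made-up epoch seconds.

```ruby
# Sketch of the RT logic from the config, runnable outside Logstash.
# `relative_time` and `state` are hypothetical; the config stores the
# first timestamp in ENV['envtime'] rather than in a hash.
def relative_time(timestamp, state)
  end_t = timestamp.to_i
  # Remember the first timestamp seen; later events are measured against it.
  state[:start] ||= end_t
  diff = (end_t - state[:start]).abs
  # Format the elapsed seconds as HH:MM:SS (wraps around after 24 hours).
  Time.at(diff).utc.strftime('%H:%M:%S')
end

state = {}
first = relative_time(1_514_188_800, state)          # first event of the run
later = relative_time(1_514_188_800 + 3_725, state)  # 1 h 2 min 5 s later
puts first
puts later
```

Because the start value lives in per-process state, each Logstash instance measures RT from the first event it happens to read, which matches launching one instance per file.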

arisbanach · December 27, 2017, 12:56am

Why are you using Logstash 2.4, though? We're up to 6.1.

warkolm · December 27, 2017, 1:05am

Definitely this. There are a number of performance improvements you would benefit from.

magnusbaeck · December 27, 2017, 10:10am

> I tried tweaking LS_HEAP and the number of workers, but saw no improvement.

Lowering the number of pipeline workers to one will (for a multi-core system) limit the amount of CPU used by Logstash.

> Elasticsearch is also running on the same server. Due to the 99% CPU utilization, other processes are hampered.

You should consider running Logstash at a lower priority to reduce its impact on the system's performance. 99% CPU utilization typically isn't a problem if the process(es) using the CPU are pre-empted by just about any other process.

The throttle filter can also be helpful.

Saket_Kumar · January 3, 2018, 9:47am

Sure, I will give these a try. Thanks.

Saket_Kumar · January 3, 2018, 9:50am

I am using the multiline filter plugin, which is deprecated from 5.x onward.
I gave it a try but ran into the event limit set by "max_lines =>". I changed it, but with no success. I want to process the entire file as one event, but it was cut off at 900 lines. The XML file I am processing contains more than a thousand lines.

system · January 31, 2018, 9:50am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
