My Logstash server has 4 CPUs and 16 GB of memory, and runs with 4 pipeline workers (the number of pipeline workers is supposed to match the number of CPUs). However, after running for more than 20 hours, it reported:
Beats input: The circuit breaker has detected a slowdown or stall in the pipeline...
Then I changed 'congestion_threshold' and 'timeout' to much larger values, following solutions I found online, and restarted Logstash. Unfortunately, after another 20+ hours the same issue occurred.
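For reference, the relevant part of my input config looks roughly like this (a sketch only; the port and the raised value are placeholders, and I have left out the timeout change since where that setting lives depends on the plugin/shipper version):

input {
  beats {
    # port is a placeholder
    port => 5044
    # raised well above the default so the circuit breaker trips later (value is illustrative)
    congestion_threshold => 300
  }
}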
I checked CPU and memory usage with the top command: CPU was around 1% and memory around 3%. Very low usage.
I'm not sure why the pipeline stalls under such low CPU and memory usage. What else can I do to avoid this issue? Any help would be appreciated.
The recommended settings for these parameters have changed over the last few releases. Which version of Logstash are you using? What does your config look like?
filter {
  grok {
    # first pass: full pattern, including the SR number embedded in the message body
    match => {
      "message" => [
        "\[(?<LogTime>(%{MONTHNUM}/%{MONTHDAY}/%{YEAR})\s+%{TIME}\s+%{WORD})\]\s+%{BASE16NUM:ThreadID}\s+(?<LogSource>([\w|\S]+))\s+%{WORD:LogLevel}\s+(?<Information>[\w|\W]*(?<SRNumber>(SR[A-Za-z\d][\d]+))[\W]+[\w|\W]*)",
        "\[(?<LogTime>(%{MONTHNUM}/%{MONTHDAY}/%{YEAR})\s+%{TIME}\s+%{WORD})\]\s+%{BASE16NUM:ThreadID}\s+(?<LogSource>([\w|\S]+))\s+%{WORD:LogLevel}\s+(?<Information>[\w|\W]*(\n)+(?<SRNumber>(SR[A-Za-z\d][\d]+))(\n)+[\w|\W]*)"
      ]
    }
    remove_field => ["message"]
  }
  # fall back to progressively looser patterns when the previous grok failed
  if "_grokparsefailure" in [tags] {
    grok {
      match => ["message", "\[(?<LogTime>(%{MONTHNUM}/%{MONTHDAY}/%{YEAR})\s+%{TIME}\s+%{WORD})\]\s+%{BASE16NUM:ThreadID}\s+(?<LogSource>([\w|\S]+))\s+%{WORD:LogLevel}\s+(?<Information>[\w|\W]*)"]
      remove_field => ["message"]
      remove_tag => ["_grokparsefailure"]
      add_field => {
        "SRNumber" => "-"
      }
    }
  }
  if "_grokparsefailure" in [tags] {
    grok {
      match => ["message", "\[(?<LogTime>(%{MONTHNUM}/%{MONTHDAY}/%{YEAR})\s+%{TIME}\s+%{WORD})\]\s+%{BASE16NUM:ThreadID}\s+%{WORD:LogLevel}\s+(?<Information>[\w|\W]*)"]
      remove_field => ["message"]
      remove_tag => ["_grokparsefailure"]
      add_field => {
        "SRNumber" => "-"
        "LogSource" => "-"
      }
    }
  }
  # last resort: keep the whole message and tag the event so it can be ignored later
  if "_grokparsefailure" in [tags] {
    grok {
      match => ["message", "(?<Information>[\w|\W]+)"]
      remove_field => ["message"]
      remove_tag => ["_grokparsefailure"]
      add_tag => ["ignore"]
      add_field => {
        "LogSource" => "-"
        "LogLevel" => "-"
        "SRNumber" => "-"
        "LogTime" => "-"
        "ThreadID" => "-"
      }
    }
  }
  if "SWIS" in [fields][ServerType] {
    date {
      match => ["LogTime", "M/d/yy HH:mm:ss:SSS z"]
      timezone => "GMT"
    }
  } else {
    date {
      match => ["LogTime", "M/d/yy HH:mm:ss:SSS z"]
      timezone => "UTC"
    }
  }
}
output {
  elasticsearch {
    hosts => "IP"
    index => "logstash-site-%{+YYYY.MM.dd}"
    flush_size => 50
  }
}
After increasing the number of pipeline workers from 4 to 8, Logstash has been working well for about 10 days. That's good progress compared to before, when it ran well for less than a day.
Is anything wrong with my configuration? If so, do you have any suggestions? This issue has bothered me for a long time, and I'd appreciate your help.
Why have you specified such a small flush_size? I would stick with the default value, as a small batch size can hurt performance by requiring additional round trips to Elasticsearch.
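Something along these lines, i.e. your existing output without the flush_size override, would let the plugin use its default batch size:

output {
  elasticsearch {
    hosts => "IP"
    index => "logstash-site-%{+YYYY.MM.dd}"
    # no flush_size here: fall back to the plugin default instead of flushing every 50 events
  }
}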
Although I don't know exactly what your data looks like or what proportion of the data each pattern matches, it looks like all the patterns start with the same sequence: \[(?<LogTime>(%{MONTHNUM}/%{MONTHDAY}/%{YEAR})\s+%{TIME}\s+%{WORD})\]\s+%{BASE16NUM:ThreadID}\s+
It may be more efficient to capture this in one grok filter and use a GREEDYDATA to capture the rest of the message into a separate variable that can then be matched against the various scenarios.
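As a rough, untested sketch of what I mean (the rest field name is just a placeholder, and since your events appear to span multiple lines you may need [\w\W]* instead of GREEDYDATA, which does not match newlines):

filter {
  # match the common prefix once and keep the remainder in a temporary field
  grok {
    match => {
      "message" => "\[(?<LogTime>(%{MONTHNUM}/%{MONTHDAY}/%{YEAR})\s+%{TIME}\s+%{WORD})\]\s+%{BASE16NUM:ThreadID}\s+%{GREEDYDATA:rest}"
    }
  }
  # then match only the (much shorter) remainder against the scenario-specific patterns
  grok {
    match => {
      "rest" => [
        "(?<LogSource>[\w|\S]+)\s+%{WORD:LogLevel}\s+(?<Information>[\w\W]*(?<SRNumber>SR[A-Za-z\d]\d+)[\w\W]*)",
        "(?<LogSource>[\w|\S]+)\s+%{WORD:LogLevel}\s+(?<Information>[\w\W]*)"
      ]
    }
    remove_field => ["message", "rest"]
  }
}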
Yes, you're right. The flush_size is too small; I hadn't noticed that until you pointed it out. Thanks a lot.
Your suggestion on grok also makes sense, but after updating the grok filter I see no improvement in Logstash performance. It hit the pipeline slowdown issue again after several hours of running. I may need to revise my grok configuration further.