High CPU with Beats input: "the pipeline is blocked, temporary refusing new connection"

About five minutes after I start Logstash, the log fills with the error in the title and the CPU goes to 100%. I posted screenshots of the log, the CPU usage, and my config; Elasticsearch itself looks normal.

Which Java process is hogging the CPU, Logstash or Elasticsearch?

Please don't post pictures of text; they are difficult to read, and some people may not even be able to see them.

Logstash; it is deployed on a server of its own.
There are eight worker threads hogging the CPU:
%CPU  COMMAND
100.2 [main] > worker0
100.2 [main] > worker2
100.2 [main] > worker5
99.9 [main] > worker1
99.9 [main] > worker3
99.9 [main] > worker4
99.9 [main] > worker6
99.9 [main] > worker7

Logstash reports that inserting into the queue is taking too long, and that "the pipeline is blocked, temporary refusing new connection".

Elasticsearch as shown in kopf:
load average: 0.0
cpu %: 2.0
heap usage %: 44.0

I found that the grok filter causes the high CPU by checking the jstack output:
at rubyjit.LogStash::Filters::Grok$$filter_428454220f2f91b7ec1ad9e019332db0555035611028566121.block_0$RUBY$file(/usr/local/elk/logstash-2.3.3/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-2.0.5/lib/logstash/filters/grok.rb:279)
at rubyjit$LogStash::Filters::Grok$$filter_428454220f2f91b7ec1ad9e019332db0555035611028566121$block_0$RUBY$file.call(rubyjit$LogStash::Filters::Grok$$filter_428454220f2f91b7ec1ad9e019332db0555035611028566121$block_0$RUBY$file)

my config:
input {
  beats {
    port => 5044
    type => "xzb_logs"
    congestion_threshold => 10
  }
}

filter {
  if [type] == "xzb_logs" {
    grok {
      match => {
message => "(?[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} (?:Z|+-(?::?(?:[0-5][0-9])))) (?[\w])/(?[\w]) (?[\w]) (?[\d\w]+-[\d\w]+)/(?\w+(.[\w]+)) %{WORD:LOGLEVEL}/(?[^:]):(?[\s\S])"
      }
    }
    date {
      match => ["TIME", "yyyy-MM-dd HH:mm:ss Z"]
    }
  }
}

output {
  elasticsearch {
    hosts => ["10.10.45.45:9200"]
    index => "logstash-%{type}-%{+YYYY.MM.dd}"
    document_type => "%{type}"
    workers => 1
    flush_size => 200
  }
}

the log format:
[TIME] [BUSINESS/PLATFORM] [DEVID] [PID-TID/PACKAGE] [LOGLEVEL]/[TAG]:[TEXT]
example:
2016-07-10 00:42:11 +0800 xzbApp/android x8dcc5070ff395e1 20551-1/com.xunlei.timealbum I/XZBDeviceManager: requestDeviceList frefXZBDeviceManager Init

How did you notice that the grok filter is maxing out your CPU?

We seem to be having the same problem over here as well: Logstash is hogging the CPU. If grok is responsible for that, are there any workarounds?

Depending on how complex your logs are, you may want to consider the new dissect filter.
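
For example, with the log format posted earlier in this thread, a dissect mapping could look roughly like this. This is an untested sketch; the field names are simply taken from that format description, and details such as the ": " before the free text would need checking against real log lines:

filter {
  dissect {
    # Sketch only: splits on the literal spaces, slashes, the dash and ": "
    # in "2016-07-10 00:42:11 +0800 xzbApp/android <devid> <pid>-<tid>/<package> I/<tag>: <text>"
    mapping => {
      "message" => "%{TIME} %{+TIME} %{+TIME} %{BUSINESS}/%{PLATFORM} %{DEVID} %{PID}-%{TID}/%{PACKAGE} %{LOGLEVEL}/%{TAG}: %{TEXT}"
    }
  }
}

Dissect splits on fixed delimiters instead of evaluating a regular expression, so it is typically far cheaper per event than grok when the layout is this rigid.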

@Christian_Dahlqvist thanks for the swift reply. Here's our config file, not sure if it's complex or not:

input {
  beats {
    port => 5000
    host => "xx.xx.xxx.xx"
    type => "smpp"
    ssl => true
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
    congestion_threshold => "50"
  }
}

filter {
  if [type] == "smpp" {
    grok {
      match => { "message" => "%{DATESTAMP:flow_date_time}%{SPACE}%{NOTSPACE:connection}%{SPACE}\[from:%{NUMBER:sendfrom}\]%{SPACE}\[to:%{NUMBER:sendto}\]%{SPACE}\[msg:%{NUMBER:msgno}:id:%{NUMBER:msgid}%{SPACE}sub:%{NUMBER:sub}%{SPACE}%{NOTSPACE:dlvrdstatus}%{SPACE}submit%{SPACE}date:%{NUMBER:submitdate}%{SPACE}done%{SPACE}date:%{NUMBER:donedate}%{SPACE}stat:%{WORD:status}%{SPACE}err:%{NUMBER:err}%{SPACE}Text:-\]" }
      break_on_match => false
      match => { "message" => "%{DATESTAMP:flow_date_time}%{SPACE}\[%{NOTSPACE:connection}\]%{SPACE}\[from:%{NUMBER:sendfrom}\]%{SPACE}\[to:%{NUMBER:sendto}\]%{SPACE}\[msg:%{GREEDYDATA:msg}\]" }
    }
  }
  if [type] == "smpp" { if [msgno] == "103" { mutate { add_tag => "dn" } } }
  if [type] == "smpp" { if [msgno] != "103" { mutate { add_tag => "mt" } } }
}

output {
  elasticsearch {
    hosts => ["something-something-b33oicrd2ovp7gmu4maowywe5e.eu-west-1.es.amazonaws.com:80"]
  }
}

The reason I can't use the dissect filter is that we are still on Logstash 2.2.4, @Christian_Dahlqvist.

Logstash 5.x should be backwards compatible with Elasticsearch 2.x, so it may be worth upgrading. I have seen reports of significant performance improvements, but this naturally varies depending on the data.
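
In the meantime, one cheap thing to try on 2.x is anchoring your grok expressions. With break_on_match => false every pattern is evaluated for every event, and an unanchored pattern that fails keeps being retried from every position in the line. A rough, untested sketch based on your second pattern:

filter {
  grok {
    # Anchoring with ^ (and $ if the pattern covers the whole line) lets a
    # non-matching event fail fast instead of being rescanned from every offset.
    match => { "message" => "^%{DATESTAMP:flow_date_time}%{SPACE}\[%{NOTSPACE:connection}\]%{SPACE}\[from:%{NUMBER:sendfrom}\]%{SPACE}\[to:%{NUMBER:sendto}\]%{SPACE}\[msg:%{GREEDYDATA:msg}\]$" }
  }
}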


@Christian_Dahlqvist we are currently using the AWS Elasticsearch service, which at this time only supports Elasticsearch 1.5 and 2.3, so I'm not sure we can upgrade.

Even removing the grok filters and restarting Logstash does not seem to help. Looking into logstash.log only shows a message that SIGTERM was received. I'm confused at this stage.