Logstash 7.12.0 High CPU usage

Hi all, I know there are several open discussions on this topic already, but nothing in them helped me resolve my situation.

#1 I installed Logstash on Ubuntu 20.04 for testing purposes and enabled it to run at system startup.

#2 Afterwards I created 3 different Logstash .conf files with different configurations in /usr/share/logstash/bin.
Below I'll paste only one of them; I can paste the rest if it helps.

#3 My Logstash config file (variant 2) looks like this:

input {
        # Accept input from the console.
        stdin{}
    }

filter {
    # Add filter here. This sample has a blank filter.
        if [message] =~ "Security" or [message] =~ "Info" {
                grok { match => { "message" => "(?:Z|[+-]%{HOUR}(?::?%{MINUTE})) (?<log_level>\w+) (?<host>[a-zA-Z0-9]+) (?<action>\w+:) (?<type>\w+=\w+>
        }
        if "Warning" in [message] {
                grok { match => { "message" => "(?:Z|[+-]%{HOUR}(?::?%{MINUTE})) (?<log_level>\w+) (?<host>[a-zA-Z0-9]+) (?<FW>\w+:) (?<req>\[\w+\]) (?<>
        }
        #prune { whitelist_names => ["^smac$"] }
        #prune { blacklist_names => ["^smac$"] }
        if [protocol] == "proto=TCP|" {
                mutate { update => { "protocol" => "TCP" } }
        }
}

output {
  #elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}

Basically, I test this configuration with an entry from a FW as stdin and display it on the same console. Since I have three different FW log levels (Security, Info and Warning), I have an if condition that applies one of two grok filters. There's plenty of room for improvement, I know; I didn't use the built-in grok patterns (only a few at the beginning), since using those patterns with the online grok debugger didn't match anything after the timestamp. My config works like a charm.
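For quick iteration on the grok patterns, a line can also be piped straight into the stdin pipeline instead of typing it interactively (the placeholder below stands for one real FW entry):

echo '<one FW log line here>' | sudo ./logstash -f logstash2.conf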

However, the CPU usage goes up to 300% even when I haven't started Logstash myself; just the service is active.
I run Logstash from its /bin directory with this command:

sudo ./logstash -f logstash2.conf

It runs successfully, but with a couple of warnings and errors.

I read in the other threads that it's normal for Logstash to eat up CPU resources if it's not running a pipeline, but even after I start it with the above command the CPU usage stays the same.
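For reference, this is how the packaged service can be checked and stopped while testing from the command line (assuming the standard systemd unit name, logstash):

# see whether the service copy of Logstash is also running
sudo systemctl status logstash

# stop it (and optionally keep it from starting at boot) during manual testing
sudo systemctl stop logstash
sudo systemctl disable logstash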

Another thing I noticed is that the PID of the Logstash process changes frequently (every 5-6 seconds).
I checked /var/log/logstash/logstash-plain.log for errors and it's full of two different errors.
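To see whether systemd keeps restarting the service behind the scenes, following its journal should show repeated start-up banners (again assuming the unit is named logstash):

sudo journalctl -u logstash -f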

I also installed jmap, as I read that high CPU usage might be due to heap overflows or something, but this is far beyond my knowledge at the moment.
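From what I've read, a rough first look can be taken with the standard JVM tools before going as deep as jmap; <pid> below is the Logstash process ID, and if jstat is not on the PATH, the JDK bundled with Logstash (under /usr/share/logstash/jdk/bin, I believe) should have it:

# per-thread CPU usage inside the Logstash JVM
top -H -p <pid>

# GC statistics once per second; constant full GCs would point at heap pressure
jstat -gcutil <pid> 1000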

Does anybody have a clue what I'm missing? I guess I didn't express myself very eloquently, so if you need additional information please let me know.

Thank you in advance.

I think I'm getting somewhere.
I put one of my Logstash config files (variant 1) in /etc/logstash/conf.d.

Now the CPU usage still surges to around 200-300% now and again, but it's much better.

Tailing my /var/log/logstash/logstash-plain.log file also looks much better.
And I no longer manually start Logstash from the command line with a particular config file, for instance:
sudo ./logstash -f logstash2.conf


It just starts automatically with whatever is in my logstash.conf file under /etc/logstash/conf.d.

One more remark: with the logstash2.conf file I posted above, the CPU usage was also high. In that file I was manually typing into the console and outputting to it, and it seems that this pipe also creates a problem, since it waits for input and, if there is no data, shuts itself down or something like that (see the input-only sketch after the config below). But my logstash.conf (variant 1), which I copied into /etc/logstash/conf.d, works, as there I seem to have a full-fledged pipeline. This is my file:

input {
  # Accept input from the console.
  stdin{}
  syslog {
    host => "127.0.0.1"
    port => 12345
    #codec => cef
  }
}

filter {
    # Add filter here. This sample has a blank filter.
      grok { match => { "message" => "%{YEAR} %{MONTHNUM} %{MONTHDAY} %{TIME} %{WORD:type_of_log} (?<tz>\+\d+:\d+) (?<action>\w+(?=\:\s))(?<tbd>\S) (?<t>
        #prune { whitelist_names => ["^smac$"] }
        #prune { blacklist_names => ["^smac$"] }
}

output {
  #elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}
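If the stdin plugin really is the culprit when there is no console attached, I guess the input section could be reduced to just the syslog listener while running as a service, something like this:

input {
  # no stdin{} here, since the service has no console attached
  syslog {
    host => "127.0.0.1"
    port => 12345
  }
}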

This time I have syslog-ng pre-configured to feed its internal logs to port 12345 on the same Ubuntu machine that Logstash runs on, and Logstash listens on that same port 12345 to pick up the logs and output them to the console.
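To check that the syslog input is actually receiving data, a test line can also be pushed to it by hand with the util-linux logger command (UDP here; the message text is just a placeholder):

logger --server 127.0.0.1 --port 12345 --udp "test message for the Logstash syslog input"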

Anyway, I believe I'm still missing some things, so please let me know about best practices or other options/configurations/etc. For that matter, I still find it blurry how to play with logstash.yml and pipelines.yml, not to mention jvm.options, startup.options and log4j2.properties.
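From what I've gathered so far, the service reads /etc/logstash/pipelines.yml at startup, and by default it simply points the main pipeline at everything in conf.d; a second pipeline could be declared the same way (the fw-test id and path below are only made-up examples):

# /etc/logstash/pipelines.yml
- pipeline.id: main
  path.config: "/etc/logstash/conf.d/*.conf"

# hypothetical second pipeline with its own config directory
#- pipeline.id: fw-test
#  path.config: "/etc/logstash/conf.d/fw/*.conf"
#  pipeline.workers: 1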

Hi,

Is the code you pasted exactly the same as what is currently in your file?

Because all the grok filters are incorrect.
You need to add "}} at the end, and some custom patterns are cut in half.

Cad.

Yes, about that: they are correct in the actual files, but I copy-pasted them from MobaXterm and only copied what fit on my screen, so I basically didn't copy the whole string. Anyway, as I said, they work like a charm and get my logs the way I want them, so I can send them to Azure Sentinel, for instance.
Otherwise, thank you for pointing this out, as others might be wondering the same.

If the PID is changing frequently then logstash is restarting. Starting logstash is very expensive (it can use a minute of CPU to initialize everything). Once it has started it should use very little CPU if it is not processing events.


Thank you Badger!
