CPU usage for Logstash hits over 300%

Hello everybody,

I am trying to parse logs using multiple grok filters. I have noticed that when I add the following pattern, Logstash CPU usage hits over 300%. I also have a couple of grok patterns for nginx logs that cause no problem; with those, Logstash CPU usage averages 10-30%. Is it possible to modify this pattern somehow, or is there an alternative solution, to overcome the high CPU problem?
Grok Pattern:

    else if [log][file][path] =~ "tomcat8" {
      grok {
        match => { "message" => ["%{DATA:[tomcat][event][date]} %{DATA:[tomcat][event][time]}\s*\ %{DATA:[tomcat][event][level]} %{DATA:[tomcat][event][server]} \--- \[%{DATA:[tomcat][event][logger]}\] %{DATA:[tomcat][event][variable]}\s*\ \: %{GREEDYDATA:[tomcat][event][message]}"] }
        remove_field => "message"
      }
    }

The following is a sample log entry that matches this pattern:
message >
2020-11-10 03:39:15.979 INFO srv-prj-prod1 --- [veu-3108-exec-6] x.b.c.r.y.DceProfiler : execution-time: 256 ms - http://127.0.0.1:3108/privacyManagement/partyPrivacyProfile?partyPrivacyProfileCharacteristic[0].contactMedium.value - prod.backend.thirdparty
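
which the pattern should break down into these fields:

    [tomcat][event][date]     => 2020-11-10
    [tomcat][event][time]     => 03:39:15.979
    [tomcat][event][level]    => INFO
    [tomcat][event][server]   => srv-prj-prod1
    [tomcat][event][logger]   => veu-3108-exec-6
    [tomcat][event][variable] => x.b.c.r.y.DceProfiler
    [tomcat][event][message]  => execution-time: 256 ms - http://127.0.0.1:3108/privacyManagement/partyPrivacyProfile?partyPrivacyProfileCharacteristic[0].contactMedium.value - prod.backend.thirdparty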

My logstash.yml file has no active settings; every line starts with #.

I think the pipeline settings might help with this. Can anyone point me to a page that explains best practices for pipeline settings?
Thanks

# ------------ Pipeline Settings --------------
#
# The ID of the pipeline.
#
# pipeline.id: main
#
# Set the number of workers that will, in parallel, execute the filters+outputs
# stage of the pipeline.
#
# This defaults to the number of the host's CPU cores.
#
# pipeline.workers: 2
#
# How many events to retrieve from inputs before sending to filters+workers
#
# pipeline.batch.size: 125
#
# How long to wait in milliseconds while polling for the next event
# before dispatching an undersized batch to filters+outputs
#
# pipeline.batch.delay: 50
#
# Force Logstash to exit during shutdown even if there are still inflight
# events in memory. By default, logstash will refuse to quit until all
# received events have been pushed to the outputs.
#
# WARNING: enabling this can lead to data loss during shutdown
#
# pipeline.unsafe_shutdown: false
#
# ------------ Pipeline Configuration Settings --------------
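
If pipeline tuning is the answer, I assume it just means uncommenting and adjusting values like these in logstash.yml (the numbers below are placeholders I have not tested, not recommendations):

    pipeline.workers: 4
    pipeline.batch.size: 250
    pipeline.batch.delay: 50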

That pattern is extremely resource intensive. Having many DATA and GREEDYDATA grok patterns mixed together takes a lot of resources. You may want to look into Dissect: https://www.elastic.co/guide/en/logstash/current/plugins-filters-dissect.html. Since you only have a single pattern, it would be much more efficient.
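
As an untested sketch, keeping your field names, the dissect equivalent could look something like this. The `->` modifier lets the time field soak up any padding spaces before the log level; note that if the logger column is space-padded before the colon, [tomcat][event][variable] would keep trailing spaces:

    dissect {
      # same field names as the grok version; "->" skips repeated space delimiters
      mapping => {
        "message" => "%{[tomcat][event][date]} %{[tomcat][event][time]->} %{[tomcat][event][level]} %{[tomcat][event][server]} --- [%{[tomcat][event][logger]}] %{[tomcat][event][variable]} : %{[tomcat][event][message]}"
      }
      remove_field => ["message"]
    }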

Patterns like DATA, and especially GREEDYDATA, are very expensive when they do not match, since they result in a lot of backtracking.

Try to modify your pattern to avoid them. Perhaps you can use NOTSPACE or something else that is cheaper. Or use dissect.
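
For reference, these are the underlying regular expressions in the stock grok-patterns definitions, which is why the first two backtrack so heavily while NOTSPACE can fail fast:

    DATA       .*?
    GREEDYDATA .*
    NOTSPACE   \S+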

Thanks for your suggestions!
I changed my grok to the following format, and it looks OK in the debug tool. I couldn't change the last DATA at the end of the pattern, because matching fails without it. This is a production environment, so I don't have much chance to experiment easily, but I will let you know the result when this change is implemented. Btw, other suggestions would be greatly appreciated!

^%{NOTSPACE:[tomcat][event][date]} %{NOTSPACE:[tomcat][event][time]}\s*\ %{NOTSPACE:[tomcat][event][level]} %{NOTSPACE:[tomcat][event][server]} \--- \[%{NOTSPACE:[tomcat][event][logger]}\] %{NOTSPACE:[tomcat][event][variable]}\s*\ \: %{DATA:[tomcat][event][message]}$
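
Dropped into the existing conditional, the updated filter looks like this:

    else if [log][file][path] =~ "tomcat8" {
      grok {
        match => { "message" => ["^%{NOTSPACE:[tomcat][event][date]} %{NOTSPACE:[tomcat][event][time]}\s*\ %{NOTSPACE:[tomcat][event][level]} %{NOTSPACE:[tomcat][event][server]} \--- \[%{NOTSPACE:[tomcat][event][logger]}\] %{NOTSPACE:[tomcat][event][variable]}\s*\ \: %{DATA:[tomcat][event][message]}$"] }
        remove_field => "message"
      }
    }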

It worked! Replacing DATA with NOTSPACE solved the high CPU problem.
