Logstash elasticsearch filter plugin performance issue

Hello,

I'm using a Filebeat->Logstash->Elasticsearch configuration to monitor and keep track of different logs in my system. Everything works correctly and CPU usage is around 20%.

There are 4 different types of log files, and we have recently decided to enrich one of them with information found in previous entries from the other 3 logs.
Using the elasticsearch filter plugin does the trick but raises CPU usage to something near 100%. I suppose this is because of the high number of queries issued by the filter. If so, is there another way to enrich data while keeping CPU usage at a healthy level?

Can you think of some possible enhancements that would help me improve performance? I tried multiple things like changing the number of Logstash worker threads, optimizing heap memory to reduce garbage collection, pre-filtering the queries...

Maybe the problem is somewhere else and I just can't see it.

Thanks in advance for your help.

Please take about 10 stack dumps, 1 second or so apart. Put them in S3, a GitHub Gist, etc. and post the links here. Once this topic is done you can delete the files if you wish.

With those we can see where LS is spending its time and help optimise further.

Hello,

I don't have access to the Elasticsearch logs because log.level is INFO and I can't change it right now. I have created a gist and included about 6 _nodes/hot_threads results. I have also included a dump from logstash-plain.log where you can see 689 queries done in a timespan of 10 seconds (as I said, I believe this to be the problem).

If more info is needed, please tell me. Next week I will be able to capture data from every log, since we are stopping the Elastic Stack by the end of the day, which will give me a window to change the logging options.

Thanks a lot in advance.

Sorry, I should have been more specific.
You say the elasticsearch filter plugin causes 100% CPU usage. I automatically assumed that applied to the Logstash JVM.

You got hot thread dumps from Elasticsearch; does this mean the high CPU utilisation is in Elasticsearch?

If raw indexing speed is less important to you than not hammering ES, consider a throttle filter before the ES filter.
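
Something along these lines (just a sketch, nothing from your actual config; the key, counts and sleep time are illustrative): the throttle filter tags events that arrive faster than a chosen rate, and a short sleep on the tagged ones applies back-pressure so the ES filter downstream is queried at a bounded rate while keeping the enrichment:

    # Tag events arriving faster than ~100 per second (numbers are illustrative).
    throttle {
      key         => "%{type}"
      after_count => 100
      period      => "1"
      max_age     => 2
      add_tag     => "throttled"
    }
    # Back-pressure: a short sleep on throttled events bounds the rate at which
    # the elasticsearch filter below gets queried, without dropping enrichment.
    if "throttled" in [tags] {
      sleep {
        time => "0.5"
      }
    }
    # ... your elasticsearch filter goes here, after the throttle ...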

The process behind the CPU consumption is the Elasticsearch JVM. It is true that I said I am having problems with the elasticsearch filter plugin for Logstash, but I assumed it was causing problems in the Elasticsearch JVM and not the Logstash JVM itself, since essentially it is just asking Elasticsearch to run searches. Are these assumptions wrong? Am I mistaken?

The relevant part of logstash.conf is:

#############################
#### ADD JOB ID TO MWFM #####
#############################
    if [type] == "mwfm" {
      elasticsearch {
        hosts  => ["NEW_OUTPUT_HOST:9200"]
        query  => "mwfm_jobid:%{[mwfm_jobid]} AND type:nfvd-wf"
        fields => { "jobID" => "jobID" }
      }
      ruby {
        code => "event.set('jobID', event.get('jobID')) rescue nil"
      }
      if [mwfm_jobid] {
        elasticsearch {
          hosts  => ["NEW_OUTPUT_HOST:9200"]
          query  => "mwfm_jobid:%{[mwfm_jobid]} AND type:mwfm AND _exists_:wf_name AND mymessage:workflow*jobid*"
          fields => { "wf_name" => "wf_name" }
        }
        ruby {
          code => "event.set('wf_name', event.get('wf_name')) rescue nil"
        }
      }
    }

When I remove that block from logstash.conf, performance gets much better (but I lose the enrichment capabilities).

  • Are you doing two ES lookups per event?
  • On the surface, it seems so. This will be a performance bottleneck as well as hammering your ES cluster.
  • Is there a better way to structure your LS conditional logic?
  • I don't understand why you need the ruby filter section.
  • Is there a way you can copy the ES documents with more denormalisation to another secondary cluster?

I'm afraid I am. I must check whether a new entry qualifies and then, if it does, change something and do another check.

I'm starting to think this is not a "configuration" problem, just me asking too much of a single-cluster, single-node environment.

I would say there isn't, but, you know, human error and all that. I'm still trying to figure out a way around querying Elasticsearch so many times... something along the lines of a file output plus the translate filter to mimic an LRU cache.
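
Something like this is what I have in mind (just a sketch, nothing tested; the dictionary path is made up, and some external job would have to keep the file up to date with mwfm_jobid-to-jobID pairs). The field/destination option names are the translate filter ones from the 5.x/6.x docs; newer versions call them source/target:

    if [type] == "mwfm" {
      # Look up jobID in a locally cached dictionary instead of querying ES
      # per event. /etc/logstash/jobid_map.yml is hypothetical: a periodic
      # export would have to write "mwfm_jobid_value": "jobID_value" lines.
      translate {
        field            => "mwfm_jobid"
        destination      => "jobID"
        dictionary_path  => "/etc/logstash/jobid_map.yml"
        refresh_interval => 300   # re-read the file every 5 minutes
      }
    }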

I believe it is where I actually add the new info to the event, which is what this is all about (log enrichment). But I could be mistaken (I'm pretty new to Ruby and plugins).

Thanks a lot for your help, I really appreciate it.

  1. Please post a typical event before enrichment, i.e. the rubydebug-encoded version via the stdout output.
  2. Please post examples of the two kinds of documents in ES that are the query targets.

I don't think you need the ruby filters. If you can event.get("a_field") then it's already in the event. There is no point in getting a value just to set it back in the same place again.
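
Concretely, this is your block from above with the ruby filters simply removed; the fields option on the elasticsearch filter already copies the looked-up values into the event:

    if [type] == "mwfm" {
      elasticsearch {
        hosts  => ["NEW_OUTPUT_HOST:9200"]
        query  => "mwfm_jobid:%{[mwfm_jobid]} AND type:nfvd-wf"
        fields => { "jobID" => "jobID" }   # copies jobID from the matched doc into the event
      }
      if [mwfm_jobid] {
        elasticsearch {
          hosts  => ["NEW_OUTPUT_HOST:9200"]
          query  => "mwfm_jobid:%{[mwfm_jobid]} AND type:mwfm AND _exists_:wf_name AND mymessage:workflow*jobid*"
          fields => { "wf_name" => "wf_name" }
        }
      }
    }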

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.