How to improve Logstash performance

I have a 5-node cluster as follows:

4 nodes, each with 16 GB RAM and 4 cores - ES
1 node with 32 GB RAM and 8 cores - Logstash

I am able to process 550000 records per minute, and I want to improve on that.

So what changes can I make?

I made the following changes in /etc/default/logstash:

# Arguments to pass to logstash agent

LS_OPTS="-b 1000 -w 16"

# Arguments to pass to java

LS_HEAP_SIZE="28g"
#LS_JAVA_OPTS="-Djava.io.tmpdir=$HOME"
LS_JAVA_OPTS="-Xmx30g -Xms2G"

Also in bin/logstash:

# LS_HEAP_SIZE="xxx" size for the -Xmx${LS_HEAP_SIZE} maximum Java heap size option, default is "1g"

LS_HEAP_SIZE="28g"

# LS_JAVA_OPTS="xxx" to append extra options to the defaults JAVA_OPTS provided by logstash

LS_JAVA_OPTS="-Xmx30g -Xms2g"

But I still don't see any improvement in Logstash performance.

Could you please provide some tips to improve Logstash performance?

Thanks,
Uday.K

What's the current bottleneck? CPU? Elasticsearch ingest rate?

Hi Magnus,

Thanks for your reply.

The bottleneck is the ingest rate.

Thanks,
Uday.K

If Elasticsearch is the bottleneck then I don't see how tweaking Logstash will help.

Sorry, I mean Logstash ingestion: each of the 8 CPU cores is running at more than 80%.
Even when I increase the heap size, there is no change in CPU usage.

I already posted my config details.

What event rate can you reach if you replace the elasticsearch output with e.g. a file output? What event rate can you reach if you disable all filters? What if you do both?
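One rough way to get the number for the file-output test is to sample how fast the output file grows. The sketch below is only an illustration, not part of Logstash: it assumes the file output writes one event per line, and the function names are hypothetical helpers.

```python
import time


def count_lines(path):
    """Count newline-terminated records in the file at `path`."""
    total = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so a large output file need not fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            total += chunk.count(b"\n")
    return total


def events_per_minute(count_before, count_after, elapsed_seconds):
    """Convert two line counts taken `elapsed_seconds` apart into events/minute."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (count_after - count_before) * 60.0 / elapsed_seconds


def sample_rate(path, interval_seconds=60.0):
    """Sample the file twice, `interval_seconds` apart, and return events/minute."""
    before = count_lines(path)
    started = time.monotonic()
    time.sleep(interval_seconds)
    after = count_lines(path)
    return events_per_minute(before, after, time.monotonic() - started)
```

Calling `sample_rate` on the path configured in the file output while Logstash is running gives an events-per-minute figure that can be compared across the three setups (no Elasticsearch, no filters, neither).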

Hi Magnus,

Sorry for the late reply.

  1. If I use the file output with the filter, there is not much change. The rate is 570 K.
  2. If I remove the filter and use the file output, the rate is around 585 K.
  3. If I remove the multiline codec from the input, remove the filter, and use the ES output, the rate is around 968 K.

From the above I see that the slow rate is caused by the multiline codec in the Logstash input. Is there any way to improve performance for an input with the multiline codec?

Please suggest any other changes I can make to my config.

Thanks,
Uday.K

The multiline codec is doing its job.
Say you have an input of 1000 lines per second; without the codec that is 1000 events per second. If 50% of those lines are continuation lines that get merged into the preceding event, you will have 500 events per second. The same work is done; the multiline codec just emits fewer events.
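To make the counting concrete, here is a small sketch of that arithmetic. It is not how the codec is implemented; it assumes a rule similar to "a line starting with whitespace belongs to the previous event", and `merge_multiline` and the sample log lines are made up for illustration.

```python
import re


def merge_multiline(lines, pattern=r"^\s"):
    """Group lines into events: a line matching `pattern` joins the previous event."""
    continuation = re.compile(pattern)
    events = []
    for line in lines:
        if events and continuation.search(line):
            # Continuation line: append it to the event that is being built.
            events[-1] += "\n" + line
        else:
            # Any other line starts a new event.
            events.append(line)
    return events


# 1000 input lines where every second line is an indented continuation.
lines = []
for i in range(500):
    lines.append(f"ERROR request {i} failed")
    lines.append("    at example.Handler.run(Handler.java:42)")

events = merge_multiline(lines)
print(len(lines), "lines in")    # 1000 lines in
print(len(events), "events out")  # 500 events out
```

Every input line is still read and matched against the pattern, so the per-line work stays the same while the event count halves. Comparing events per minute with and without the codec therefore compares two different units.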

Thanks, Guyboertie, for your reply.
