I am trying to measure the performance impact of enabling persisted queues on Logstash 6.6.
To eliminate any influence of the Elasticsearch environment, all tests write their output to /dev/null and no additional filters are configured in Logstash:
output {
  file {
    path => "/dev/null"
  }
}
I have tried the following input configurations:
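For reference, the inputs were along these lines (the port, codec and generator message shown here are placeholders, not necessarily my exact settings):

# stdin input
input { stdin { } }

# TCP input (port is a placeholder)
input {
  tcp {
    port => 5044
    codec => line
  }
}

# generator input (message is a placeholder; count => 0 generates events indefinitely)
input {
  generator {
    message => "sample log line"
    count => 0
  }
}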
I have tested both memory and persisted queues for each of the mentioned input types (TCP, stdin and the generator plugin). The persisted queue is located on a tmpfs partition in RAM to eliminate any influence of HDD performance.
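For the persisted queue runs, the relevant queue settings in logstash.yml look roughly like this (the tmpfs mount point and size are placeholders):

queue.type: persisted
path.queue: /mnt/tmpfs/logstash-queue   # placeholder tmpfs mount point
queue.max_bytes: 4gb                    # placeholder; sized to fit within the tmpfs partition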
I use the monitoring API and Grafana to measure the number of events processed by Logstash: I poll :9600/_node/stats/pipelines/ and track the events "in" and "out" counters.
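Concretely, the measurement amounts to polling the node stats API, roughly:

curl -s 'http://localhost:9600/_node/stats/pipelines?pretty'
# throughput is derived from the difference between the "events" -> "in" / "out"
# counters of successive polls, divided by the poll interval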
I have used the same corpus of log lines for the stdin and TCP tests (sent via the same Python script), and the same lines for the generator plugin.
Empirically, the best performance in my environment (for each queue type) appears to be achieved with the following settings:
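These were tuned through the usual pipeline settings in logstash.yml; the values below are placeholders rather than the exact ones used:

pipeline.workers: 4        # placeholder
pipeline.batch.size: 130   # 130 is the value referenced in the batch size discussion below
pipeline.batch.delay: 50   # placeholder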
As can be seen from the results below, enabling the persisted queue significantly decreases pure Logstash performance:
stdin input + memory queue 104.6 K events/s max
generator input + memory queue 119.0 K events/s max
TCP input + memory queue 36.2 K events/s max
stdin input + persisted queue 35.9 K events/s max
generator input + persisted queue 48.4 K events/s max
TCP input + persisted queue 19.2 K events/s max
Please note that the log line corpus, pipeline worker and batch size configurations, and JVM settings remained the same between the "memory" and "persisted" queue tests of the same input.
Is such a performance impact expected for the persisted queue, or is something off in my tests? What is the recommended way of comparing memory and persisted queue performance?
@alexpanas, your testing methodology looks pretty good to me and the throughput ratio that you are seeing between memory queues and persisted queues is similar to the ratio we've seen in our internal testing. At its current state of development, the persistent queues feature does have a performance cost.
Are there any recommendations (besides queue.checkpoint.writes: 0 and queue.checkpoint.acks: 0) to improve persisted queue performance, or is the ratio I was able to get pretty much the best it can be?
Am I also understanding correctly that the throughput achievable with queue.checkpoint.writes: 1 and queue.checkpoint.acks: 1 can be used as a worst-case marker for persisted queue performance?
@alexpanas, those are the two main software settings that affect PQ performance and yes, setting them both to 1 is definitely the worst-case scenario. Beyond those two, we've found that increasing queue.page_capacity beyond the default 64MB is generally counterproductive. And finally, of course, hardware matters so placing the PQ files on a fast SSD will perform better than placing them on a slower HDD.
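For reference, a minimal logstash.yml sketch of the settings mentioned above:

# PQ checkpoint tuning discussed in this thread
queue.checkpoint.writes: 0   # 0 = unlimited writes between forced checkpoints (fastest, least durable)
queue.checkpoint.acks: 0     # 0 = unlimited acks between forced checkpoints
# queue.page_capacity: 64mb  # default; increasing it is generally counterproductive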
I have also noticed that pipeline.batch.size tends to behave differently for memory and persisted queues: increasing this parameter to 4000 improved the performance of the memory queue, but significantly decreased the performance of the persisted queue (the generator input was able to push only 39.4 K events/s max, versus 48.4 K with the value of 130 I ended up using in my tests).
Is this expected? Are there any guidelines on how to calculate the optimal value of this parameter without a trial-and-error process (which would require experimenting on the production environment, something I am not fond of)?
@alexpanas, it is certainly true, as you've found, that higher batch sizes tend to work better for memory queues. With smaller batch sizes, we see significant lock contention on the memory queues which reduces throughput.
It's harder to find optimal sizes with the PQ, though. Generally speaking, they are smaller, but the optimal size depends a lot on the distribution of input events and how quickly they can be processed on the output side. I would suggest experimenting in a QA or staging environment to find the optimal batch size there. I would then use that same batch size in production and if your throughput in production was similar to what you observed in your QA or staging environment, I would assume that you are very close to the optimal batch size in your production environment as well.