Undersized batches produced in ES output while persistent queue is enabled


#1

I've described this problem in more detail here https://github.com/elastic/logstash/issues/7243, but have received no reply so far.

TL;DR: enabling the persistent queue in Logstash results in undersized elasticsearch output {} batches being produced. All related batch-size settings appear to have no effect; data is simply sent on in batches of exactly the same size as those provided by the beats input {} streams.

Did anyone else notice this? Is this expected?


(Christian Dahlqvist) #2

Each micro-batch in Logstash is processed and acknowledged separately, so I believe you will never see bulk requests to Elasticsearch larger than the pipeline batch size, which by default is 125. I would recommend increasing the batch size to e.g. 1000 and validate that this results in larger bulk requests to Elasticsearch.
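The suggestion above amounts to a logstash.yml change; a minimal sketch, where the values are illustrative rather than a tuning recommendation:

```yaml
# logstash.yml — illustrative values for testing larger bulk requests
pipeline.batch.size: 1000   # events per worker micro-batch (default 125)
pipeline.batch.delay: 50    # ms to wait for a full batch before dispatching
```

After restarting Logstash, the resulting bulk request sizes can be compared against the previous behavior.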


#3

Hi @Christian_Dahlqvist

In case I was unclear in my first post:

My related logstash.yml settings are:

  pipeline.batch.size: "4096"

and this in elasticsearch output {}

    flush_size => 4096
    idle_flush_time => 6

If I use queue.type: memory, then elasticsearch output batches are accumulated properly up to 4096 events, or sent after idle_flush_time, whichever happens first.

When I switch to queue.type: persisted, the small input batches just fly through Logstash in real time without being accumulated into larger chunks at all.
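For reference, the two configurations being compared can be sketched as a single logstash.yml toggle (values illustrative, matching the settings quoted above):

```yaml
# logstash.yml — switch queue.type between "memory" and "persisted"
# to reproduce the batching difference described above
queue.type: persisted
pipeline.batch.size: 4096
```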


(system) #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.