TL;DR: enabling the persistent queue in Logstash results in undersized elasticsearch output {} batches. All of the related batch size settings appear to have no effect; data is simply forwarded in batches of exactly the same size as those delivered by the beats input {} streams.
Each micro-batch in Logstash is processed and acknowledged separately, so I believe you will never see bulk requests to Elasticsearch larger than the pipeline batch size, which defaults to 125. I would recommend increasing the batch size to e.g. 1000 and validating that this results in larger bulk requests to Elasticsearch.
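As a minimal sketch, the pipeline batch settings referred to above live in logstash.yml (the values below are illustrative, taken from the suggestion in this reply):

```yaml
# logstash.yml -- pipeline batching (values are illustrative)
pipeline.batch.size: 1000   # max events a worker collects per batch; defaults to 125
pipeline.batch.delay: 50    # ms to wait for a full batch before dispatching a partial one
```

These can also be passed per pipeline in pipelines.yml or via -b on the command line.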
If I use queue.type: memory then elasticsearch output batches are accumulated properly, up to 4096 events, or sent after idle_flush_time, whichever happens first.
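For context, the output behaviour described here matches the flush_size and idle_flush_time options that the elasticsearch output accepted in older Logstash versions (both have since been removed in favour of the pipeline batch settings); a sketch using the values mentioned above:

```
output {
  elasticsearch {
    hosts           => ["localhost:9200"]  # assumed endpoint, for illustration
    flush_size      => 4096   # accumulate up to 4096 events per bulk request
    idle_flush_time => 1      # seconds of idle time before flushing a partial batch
  }
}
```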
When I switch to queue.type: persisted, the small input batches just fly through Logstash in near real time without being accumulated into larger chunks at all.