I would like to aggregate events until the aggregated event reaches a certain size limit (e.g. 256 KB), at which point I would like to push the aggregated event to the queue. There are no start/end events.
Push-on-timeout does not solve my problem, because hundreds of events may unpredictably arrive in a very short time interval. Is there a way to do that?
At first I thought not, but there is a distinctly non-scalable way to do it. In addition to the usual requirement of "--pipeline.workers 1" you need "--pipeline.batch.size 1", so that every event goes through the second aggregate filter before the first aggregate filter processes another event.
This is just a proof-of-concept that demonstrates how it could be done.
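A minimal sketch of that two-filter approach (field names, the batch counter, and the 256 KB threshold are illustrative; the options shown are from the logstash-filter-aggregate plugin): the first aggregate tracks the running byte count under a constant task id and bumps a batch number whenever the limit is exceeded; the second aggregate groups events by that batch number, so when the batch number changes, push_previous_map_as_event flushes the completed batch as a single event.

```
input {
  generator { count => 1000 lines => [ 'some event text' ] }
  # stdin keeps the pipeline alive after the generator finishes,
  # so the final timeout can fire (see note below)
  stdin {}
}
filter {
  # All events share one constant task id for the size counter
  mutate { add_field => { "[@metadata][task]" => "constant" } }

  # First aggregate: accumulate the running size and assign a batch number
  aggregate {
    task_id => "%{[@metadata][task]}"
    code => '
      map["batch"] ||= 0
      map["bytes"] ||= 0
      map["bytes"] += event.get("message").bytesize
      if map["bytes"] > 262144                 # 256 KB limit (illustrative)
        map["batch"] += 1                       # start a new batch
        map["bytes"] = event.get("message").bytesize
      end
      event.set("[@metadata][batch]", map["batch"])
    '
  }

  # Second aggregate: collect events per batch number; when the batch
  # number changes, the previous map is pushed as the aggregated event
  aggregate {
    task_id => "%{[@metadata][batch]}"
    code => '
      map["messages"] ||= []
      map["messages"] << event.get("message")
      event.cancel
    '
    push_previous_map_as_event => true
    timeout => 5
  }
}
output { stdout { codec => rubydebug } }
```

Because push_previous_map_as_event only fires when the next batch starts, the last partial batch is emitted by the timeout, which is why the pipeline must stay open long enough for it to trigger.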
The stdin input is there only to prevent logstash from shutting down the pipeline when the generator input finishes. If you removed it while using a generator input, the timeout would never fire, so the final partial batch would be lost. For almost any other input it is neither needed nor useful.