You could try using a sleep filter. By default there is an in-memory queue between the input and the pipeline. If the queue fills up then the input stops reading data.
If you want to limit the rate at which each worker thread drains the queue to roughly 1,000 events per second, you could try

sleep { time => "0.001" }

i.e. sleep for one millisecond per event. In practice the observed rate will settle somewhat below 1,000 per second, since the sleep adds to (rather than replaces) the time spent processing each event. Note also that the limit is per worker thread, so with multiple workers the aggregate rate will be higher.
By limiting the rate at which events are removed from the queue you can limit the rate at which events are pulled from the database and added to the queue.
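As a rough sketch of how this fits together, assuming a jdbc input is what reads from the database (the connection settings below are placeholders, not taken from your setup):

```
input {
  jdbc {
    # Hypothetical connection settings -- replace with your own.
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/mydb"
    jdbc_user => "logstash"
    jdbc_driver_class => "org.postgresql.Driver"
    statement => "SELECT * FROM events"
  }
}
filter {
  # Sleep 1 ms per event. With a single worker (-w 1) this caps
  # throughput at roughly 1000 events/second; each additional
  # worker raises that ceiling proportionally.
  sleep { time => "0.001" }
}
output {
  stdout { codec => dots }
}
```

Running the pipeline with `-w 1` makes the cap predictable, since the sleep happens once per event per worker.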