I've done my own research on this, but have not yet found a clear solution. Hoping someone can advise.
We have a very simple single-node ELK server and a client with Filebeat (Filebeat -> LS -> ES). We don't anticipate a high volume (even in Production).
Problem is, when initially starting the server, we want to "backfill" it with a few months' worth of logs (say 600+ 1MB daily log files of various types). Filebeat takes off running, starting as many harvesters as it can, and floods the ELK server as if there's no tomorrow. LS seems to keep up OK, but ES gets overwhelmed pretty quickly (constantly returning 429 / "rate limiting" errors during the backfill). Although LS appears to keep retrying until successful, I've seen evidence that a lot of messages are getting lost and never making it into ES.
We could attempt to size and configure the server to handle this initial flood, but that doesn't seem appropriate, since this is a once-off operation; if the backfill takes a few hours to catch up, that's no big deal.
How can we safely process a significant backlog of files -- as a once-off operation -- on a modest server, with the various components throttling traffic so that ES isn't overwhelmed (which currently results in errors and missing documents)?
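For what it's worth, my reading so far suggests limiting the number of concurrent harvesters on the Filebeat side and shrinking the batch size sent to LS. Something like the sketch below is what I had in mind -- the paths and hosts are placeholders, the values are untested guesses, and depending on the Filebeat version the section may be `filebeat.prospectors` rather than `filebeat.inputs`. Is this the right direction, or is there a better way to apply back-pressure?

```yaml
# filebeat.yml -- rough sketch only; paths/hosts are placeholders and the
# values are guesses we haven't validated yet
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.log
    harvester_limit: 2     # only read a couple of files concurrently
    close_eof: true        # release a harvester once a file is fully read,
                           # so harvester_limit can cycle through the backlog
    scan_frequency: 30s    # rescan for remaining files less aggressively

output.logstash:
  hosts: ["logstash.example.local:5044"]
  bulk_max_size: 512       # smaller batches per request than the default
```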
Thoughts?
Thanks,
Greg