Throttling processing of log backfill

I've done my own research on this, but have not yet found a clear solution. Hoping someone can advise.

We have a very simple single-node ELK server and a client with Filebeat (Filebeat -> LS -> ES). We don't anticipate a high volume (even in Production).

Problem is, when initially starting the server, we want to "backfill" it with a few months' worth of logs (say 600+ 1MB daily log files of various types). Filebeat takes off running, spinning up as many harvesters as it can, and floods the ELK server as if there's no tomorrow. LS seems to keep up OK, but ES gets overwhelmed pretty quickly, constantly returning 429 / "rate limiting" errors during the backfill. Though it appears LS will keep retrying until successful, I've seen evidence that a lot of messages are getting lost and never making it into ES.
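
For reference, the closest knob I've found in my own research is Filebeat's per-input harvester limit. A minimal sketch, assuming a recent Filebeat; the paths and values here are illustrative, not our actual config:

```yaml
# filebeat.yml (sketch -- paths and values are illustrative)
filebeat.inputs:            # "filebeat.prospectors" on older Filebeat versions
  - type: log
    paths:
      - /var/log/app/*.log  # hypothetical location of the backfill files
    harvester_limit: 4      # cap concurrent harvesters for this input
    close_eof: true         # free a harvester once a file is fully read,
                            # so the limit doesn't starve the remaining files
```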

On the one hand, we could attempt to size and configure the server to support this initial flood, but that doesn't seem appropriate: this is a one-off operation, and if it takes a few hours to catch up, no big deal.

How can we safely process a significant "backlog" of files -- as a one-off -- on a modest server, having the various components "throttle" the traffic so that ES isn't overwhelmed (which seems to be what causes the errors and missing documents)?
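
The other knob I've been eyeing is the batch size on Filebeat's Logstash output; again just a sketch, with illustrative host and values:

```yaml
# filebeat.yml (sketch -- host and values are illustrative)
output.logstash:
  hosts: ["elk-server:5044"]  # hypothetical ELK host
  bulk_max_size: 512          # smaller batches per request (default is 2048)
  worker: 1                   # a single publisher worker limits concurrency
```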

Thoughts?

Thanks,
Greg

I think this might be more of a Logstash or Filebeat question than an Elasticsearch one. There isn't a whole lot more that Elasticsearch can do; it is already providing backpressure to Logstash/Filebeat. Maybe open the question on the Logstash forum?
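
If it helps as a starting point over there, the usual Logstash-side knobs look something like the sketch below (illustrative values, assuming a Logstash version that supports the persistent queue):

```yaml
# logstash.yml (sketch -- values are illustrative starting points)
pipeline.workers: 2        # fewer workers means fewer concurrent bulk requests to ES
pipeline.batch.size: 125   # events per worker batch (125 is the default)
pipeline.batch.delay: 50   # ms to wait while filling a partial batch
queue.type: persisted      # buffer bursts on disk instead of only in memory
queue.max_bytes: 1gb       # cap disk usage for the persistent queue
```

The persistent queue only absorbs the burst; it's the worker and batch settings that actually slow the flow into ES.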

Thanks Nik. That seems reasonable. I'll do that.

Greg
