How to run multiple instances of Filebeat for the same input/output?

Hi,
Is there a recommended way to run multiple instances of Filebeat, on different physical servers (VMs), that would process data from the same input and write to the same output (ES index)?
Basically, I would like the option to scale data ingestion by increasing the number of Filebeat instances (kind of like adding consumer instances to the same consumer group in Kafka). This would also serve as a failover setup in case one instance dies for whatever reason.

I saw posts about running different Filebeat instances for different pipelines, but that is not my goal.

thank you,
Marina

You could use multiple VMs, but depending on the VM specs it would probably be simpler to just run multiple Filebeat processes on the same host. You can also try increasing the number of workers for the inputs/outputs to see if that helps.
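As a rough sketch of the "more workers" suggestion on the output side, the Elasticsearch output has a `worker` setting that controls how many concurrent publishers are used per configured host. The host URL and the value `4` below are placeholders, not recommendations; the right number depends on your cluster and event volume.

```yaml
# filebeat.yml (fragment): hypothetical host, illustrative worker count
output.elasticsearch:
  hosts: ["https://es.example.internal:9200"]  # placeholder URL
  worker: 4  # concurrent workers per host (default is 1)
```

It is probably worth raising this gradually and watching Filebeat's own metrics, since more workers only helps if the output is actually the bottleneck.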

It really depends on what your input is and whether it will generate duplicates.

What input are you using?

Thanks, @leandrojmp ,
I am using the gcp-pubsub input with one topic, and this is exactly what I would like to find out: can multiple Filebeat instances share the work by each reading separate batches of events from the same Pub/Sub topic, basically acting as one distributed collector?

Thank you!

In this case, according to the documentation, you can:

"Multiple Filebeat instances can be configured to read from the same subscription to achieve high-availability or increased throughput."
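For illustration, a minimal sketch of what each Filebeat instance's input section might look like. The important part is that every instance uses the same `subscription.name`, so Pub/Sub spreads the messages across them. The project ID, topic, subscription name, and credentials path are made-up placeholders, and the `num_goroutines` value is only an example.

```yaml
# filebeat.yml (fragment): deploy the same input config on every instance
filebeat.inputs:
- type: gcp-pubsub
  project_id: my-gcp-project            # placeholder
  topic: my-logs-topic                  # placeholder
  subscription.name: filebeat-shared    # identical on all instances
  subscription.create: false            # assumes the subscription already exists
  subscription.num_goroutines: 2        # per-instance parallelism (default is 1)
  credentials_file: ${path.config}/pubsub-credentials.json  # placeholder path
```

Note that Pub/Sub delivery is at-least-once by default, so an occasional duplicate event is possible regardless of how many instances you run; if that matters for your index, you may want to look into setting a deterministic document ID.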

Thank you!
