My problem is pretty similar to this one: "Logstash to logstash cluster with or without load balancer".
I contacted the author, who told me he hadn't found a good solution yet. I'll give a short summary of my problem:
- There are "forwarder" machines in several networks. Many sources from inside their respective networks send them data.
- Those machines then forward the data to either an ES cluster in a completely different network, or to one inside their network. Both cases coexist.
- Depending on the situation, more than one Logstash instance might be required in front of those ES clusters, either to handle the load or for availability.
Most of this is already in place, but without Logstash.
So my design idea was to have several instances of Logstash:
- one on the "forwarder", to have different inputs for different kinds of data, which would then forward to
- one or more for each ES cluster, doing the data operations (extraction, normalisation, etc)
I can't really have all the data operations on the forwarders, as they aren't powerful enough.
So my first question is: how do I do proper Logstash-to-Logstash communication? I know the typical answer is the lumberjack input/output plugins, but they haven't been updated in several years, and many problems have been reported against them. I'd rather not move to a deprecated or unsupported technology. What's more, the lumberjack output plugin does not support load balancing; an issue about this has been open since 2016.
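For reference, here is a rough sketch of the two-tier layout using the lumberjack plugins, i.e. the approach I'd rather avoid. It is only meant to illustrate the design above, not a tested setup: all hostnames, ports, and certificate paths are placeholders I made up, and the syslog/tcp inputs are just examples of "different inputs for different kinds of data".

```
# --- forwarder.conf (on each "forwarder" machine) ---
input {
  # Example inputs only; the real ones depend on the data sources.
  syslog {
    port => 5514
  }
  tcp {
    port  => 5001
    codec => json_lines
  }
}
output {
  lumberjack {
    # Placeholder hosts. A list can be given here, but as stated above
    # the output plugin does not load-balance across it.
    hosts           => ["indexer-1.example", "indexer-2.example"]
    port            => 5000
    ssl_certificate => "/etc/logstash/certs/lumberjack.crt"
  }
}

# --- indexer.conf (one or more instances in front of each ES cluster) ---
input {
  lumberjack {
    port            => 5000
    ssl_certificate => "/etc/logstash/certs/lumberjack.crt"
    ssl_key         => "/etc/logstash/certs/lumberjack.key"
  }
}
filter {
  # The heavy data operations (extraction, normalisation, etc.) go here,
  # since the forwarders aren't powerful enough to run them.
}
output {
  elasticsearch {
    hosts => ["http://es-node-1.example:9200"]
  }
}
```

The filter block is left empty on purpose; the point is only the split between a thin forwarder and the indexers doing the real work.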
Which brings me to the second question, of course: how do I load-balance Logstash-to-Logstash communication? Is it even possible without external aid?
Thanks in advance for any pointers!