Question regarding Logstash Horizontal Scaling

krokodil · May 6, 2019, 12:27pm

Hi All,

Hope I can get some clarity on the documentation supplied by Logstash.

We're looking at implementing high availability for Logstash (Elasticsearch clustering is working fine).

As per the Logstash documentation at https://www.elastic.co/guide/en/logstash/current/deploying-and-scaling.html,

They state:

Logstash is horizontally scalable and can form groups of nodes running the same pipeline

We are aware that you can load balance with HAProxy, Beats and/or hardware load balancers, but that line does not indicate how to get Logstash balancing working. It's very vague.

There is very little supporting documentation regarding this. How is this achieved? From what I've seen, there is no support for HA / clustering in Logstash at this moment in time.

Does anyone perhaps have any input on the above? Any feedback would be greatly appreciated.

nimda · May 6, 2019, 12:43pm

As I understand it, there are a couple of options for scaling Logstash horizontally. One is to use a couple of Logstash instances that feed data into Kafka topics. You then provide the actual worker Logstash instances, which form a consumer group, read the data from those topics, and process it.

That is one option, but I would recommend looking at Elasticsearch ingest pipelines and trying to implement it that way. There is a lot less management overhead with Elasticsearch pipelines, and HA is included automatically because Elasticsearch itself provides it.

krokodil · May 6, 2019, 12:53pm

Hi nimda,

Appreciate the reply.

I'm looking for an actual explanation on what Logstash meant by that sentence I supplied above.

It seems as if the Elastic Team is saying horizontal scaling is supported (like ES) in the form of a cluster but it doesn't seem so.

If anyone could elaborate on what Elastic actually means by the statement above and how to implement that, it would be greatly appreciated.

Thanks again!

Badger · May 6, 2019, 2:05pm

Logstash does not support clustering in the sense that members of the cluster coördinate with one another.

If you have more than one Logstash instance running the same pipeline, then it does not matter which instance processes any particular event (unless you are using filters like aggregate that require all events to go through a single worker thread; obviously events cannot go through the same worker thread if they are in different processes). In most cases you can scale capacity by adding more instances. You can give Beats a list of instances and tell it to load balance across them. On the Logstash side there really isn't much to document.

nimda · May 6, 2019, 2:07pm

I think this means that you can deploy multiple Logstash instances with the same config and, in this way, form a kind of cluster.

After this, you can list multiple destinations (multiple Logstash instances) in your Beats config, and Beats will automatically distribute the collected logs across the given destinations. Beats will also keep track of whether one of the destinations is unavailable and send the logs to the available ones.
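
For illustration, a minimal sketch of what this could look like in a Filebeat config. The two hostnames are hypothetical placeholders, and 5044 is only the port conventionally used for the Logstash beats input, so adjust both to your setup. As far as I know, `loadbalance` has to be set to `true` for events to be spread across all listed hosts; without it, Filebeat picks one host and only switches to another when that one becomes unreachable.

```yaml
# filebeat.yml (fragment) - sketch only, hostnames are made up
output.logstash:
  # Every Logstash instance listed here should run the same pipeline
  hosts:
    - "logstash-1.example.internal:5044"
    - "logstash-2.example.internal:5044"
  # Distribute events across all reachable hosts instead of using just one
  loadbalance: true
```

With a setup like this, adding capacity would just mean starting another Logstash instance with the same pipeline and adding it to the `hosts` list.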

system · June 3, 2019, 2:07pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
