Logstash Architecture at Scale

Regarding the doc "Deploying and Scaling Logstash", what is the recommended way (or ways) to implement a mature Logstash cluster?

The picture at the bottom of the doc shows multiple instances of Shippers, a message queue, and one instance of an Indexer. Should each instance (individual shippers, queue, and indexer) be on its own server? Or should the message queue and indexer be on the same server? Can or should you run multiple Shippers on the same server?

In my scenario I have a syslog-ng server and a load balancer receiving 50+ GB/day. I know that is more than one server can handle, so I'll need multiple servers.
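To put that volume in perspective, here is a rough back-of-envelope sketch that converts a daily volume into an average event rate. The 300-byte average event size and the 3x peak factor are assumptions for illustration only, not measured values, and the function name is hypothetical; substitute numbers from the real feed.

```python
# Back-of-envelope sizing sketch. The event size and peak factor used
# below are assumptions, not measurements from the actual feed.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_events_per_second(gb_per_day, avg_event_bytes):
    """Average event rate implied by a daily volume (decimal GB)."""
    bytes_per_day = gb_per_day * 1_000_000_000
    return bytes_per_day / avg_event_bytes / SECONDS_PER_DAY


if __name__ == "__main__":
    avg = average_events_per_second(50, 300)  # assumed 300-byte events
    peak = avg * 3                            # assumed 3x peak-to-average ratio
    print(f"average: {avg:,.0f} events/s, assumed peak: {peak:,.0f} events/s")
    # prints: average: 1,929 events/s, assumed peak: 5,787 events/s
```

Under these assumptions the feed averages roughly 2,000 events per second, so any sizing should be done against the peak rate rather than the daily average.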

Also, what's a good way to monitor the performance of the Logstash cluster (events per second, dropped messages in the queue, memory use, etc.)?

Thanks in advance for your help.


It's up to you and your resources, really. If a single instance isn't using all the resources on the box, start another!

You can use the metrics filter, but at the moment we don't have any monitoring visibility into LS. It's coming very soon though!

I get that it's up to me how I set it up, but I would imagine there is a preferred or tested approach that works better than others. If anyone has set up Logstash in a scenario where LS is "drinking from a firehose", can you share what worked and what didn't?

I'd rather not reinvent the wheel trying to optimize LS for a high-volume feed.