How does Logstash memory behave with lots of Beats connections?

Hi all,

We have run into some issues in production around Logstash and are a bit worried about the future, so we would like to hear the community's opinion. Let me explain:

So far we have run Logstash in production as a Docker container with the default JVM options (1 GB heap). We recently started using Winlogbeat directly on Windows endpoints (~80 endpoints to begin with) to gather Windows events in ECS format. Our Winlogbeat configuration outputs an average of 230 events per second across all endpoints. Our Logstash instance does much more than that and has to handle an average of 1,000 events per second.

Everything ran smoothly until we started listening for Beats connections in Logstash: it ran out of memory. Increasing the Java heap to 8 GB fixed the issue for the time being. The thing is, we are going to continue rolling Winlogbeat out to 12,000 endpoints. After this latest production issue, we are concerned about how much memory that will require, and whether Logstash will be able to sustain that many simultaneous connections within the maximum memory we can provide: 32 GB.
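For reference, here is a minimal sketch of one way to raise the heap with the official Logstash Docker image, which appends `LS_JAVA_OPTS` to the JVM options (the compose layout and image tag are illustrative, not our exact setup):

```yaml
# docker-compose fragment (sketch; image tag is illustrative)
services:
  logstash:
    image: docker.elastic.co/logstash/logstash:8.13.4
    environment:
      # Set -Xms and -Xmx equal so the heap is allocated up front
      LS_JAVA_OPTS: "-Xms8g -Xmx8g"
```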

My first instinct is that the memory required is mostly about the internal queues holding events. But I would be interested to know what the memory overhead of so many Beats connections in Logstash might be, or whether that is negligible compared to the events filling the internal queues!? Note that:

  • Using a single Winlogbeat instance on a Windows Event Collector is unfortunately not an option :frowning:
  • We run a reverse proxy with mutual TLS in front of Logstash

What do you guys think about it?

Thx

    input {
      beats {
        port => 5044
      }
    }

    filter {
      ruby {
        path => "/usr/share/logstash/logstash_ruby_scripts/endpoint_meta.rb"
        script_params => {
          api_key_field => "[meta][key]"
          master_url => "http://api-center/endpoint/api_key/validate"
          ca_path => "/etc/ssl/certs/ca-certificates.crt"
          cache_duration => 300
          failed_cache_duration => 60
        }
      }
      ruby {
        code => "event.set('[@metadata][ingest_date]', Time.now.strftime('%Y-%m-%d_%H_%M'))"
      }
      mutate {
        copy => { "[meta][deployment_name]" => "[@metadata][filename]" }
      }
      mutate {
        gsub => [
          "[@metadata][filename]", "[^A-Za-z0-9]", ""
        ]
      }
    }

    output {
      file {
        path => "/data/host_logs/winlogbeat/%{[@metadata][filename]}_windows-%{[@metadata][ingest_date]}.json"
      }
    }
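As an aside on the pipeline above: the second `ruby` filter only stamps each event with an ingest date that is then used in the output filename. The same call can be exercised in plain Ruby (a standalone sketch, not tied to Logstash):

```ruby
# Mirrors the filter's code: Time.now.strftime('%Y-%m-%d_%H_%M')
# Minute precision means the file output rolls to a new file at most
# once per minute for a given deployment name.
ingest_date = Time.now.strftime('%Y-%m-%d_%H_%M')
puts ingest_date
```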

Is it the number of Winlogbeat endpoints or just the sheer volume of logs?

That's more or less my question :smiley: I don't know whether it's safe to run with 12K connections, or whether only the log volume matters. If a Java queue is created per connection, that could (I guess) also be a point of attention. I don't actually know the implementation.

I'm pretty sure that inputs do matter for Logstash's memory usage.
Is it not an option to scale your Logstash horizontally?

If you're going to add 12k more hosts, you should also think about the volatility of your log volume. If your Logstash is already scratching its max RAM capacity, any burst of events might overload it.
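One way to absorb bursts without growing the heap is Logstash's persistent queue, which buffers events on disk instead of in memory. A sketch of the relevant `logstash.yml` settings (size and path are illustrative):

```yaml
# logstash.yml fragment (sketch; size and path are illustrative)
queue.type: persisted           # default is "memory"
queue.max_bytes: 8gb            # disk budget per pipeline
path.queue: /usr/share/logstash/data/queue
```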

Initially I was going to propose adding an intermediate server running Filebeat to limit the number of endpoint connections; that's what I have set up in my architecture.
You could try this and see how it affects heap usage; it might tell you where the bottleneck is.

You could try something like:
[winlogbeat] -(12K endpoints)-> [filebeat server(s)] --> [logstash]

Is it not an option to scale your logstash horizontally?

Since Kafka is deprecated, I think the best option might be to hold and forward with Filebeat.

What are your thoughts on this?

I'm not aware Kafka is deprecated :face_with_monocle:

Anyway, we are seriously considering scaling Logstash out horizontally and having different sets of endpoints send to different Logstash instances. It's not ideal, but at least it fits within our limitations.
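If it helps, Winlogbeat can also load-balance across several Logstash instances itself, which avoids pinning fixed endpoint sets to fixed instances. A sketch using the standard `output.logstash` settings (hostnames are illustrative):

```yaml
# winlogbeat.yml fragment (sketch; hostnames are illustrative)
output.logstash:
  hosts: ["logstash-1:5044", "logstash-2:5044", "logstash-3:5044"]
  loadbalance: true   # spread batches across all hosts instead of picking one
```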

Many thanks for your replies ! :smiley:

I'm not aware Kafka is deprecated :face_with_monocle:

Just googled it and it's not; I don't know where I heard that from, sorry for the misinformation!

I'm actually using Kafka together with ZooKeeper in my own cluster. I think if Kafka were ever deprecated for good, I'd switch over to RabbitMQ or Redis.
