Stability Issues at 10k EPS in Elastic-Agent + Logstash – Elasticsearch Bottleneck?

I am currently using Elastic-Agent for log collection and Logstash for log forwarding. I am running a stress test to estimate the hardware requirements and cost of the collector setup (Elastic-Agent + Logstash). The Logstash pipeline batch size is set to 1000.
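For reference, the batch size is configured in logstash.yml; everything else is left at its default (a sketch — the worker count is not set explicitly, so it defaults to one per CPU core):

```yaml
# logstash.yml - pipeline tuning used for the stress test
pipeline.batch.size: 1000   # events per worker batch (Logstash default is 125)
# pipeline.workers is not set, so it defaults to the number of CPU cores
```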

Apache JMeter (192.168.3.170) -> Elastic-Agent [Fortigate] (192.168.3.172:515) -> Logstash (192.168.3.172:5044) -> Elasticsearch (8 Core/16 GB RAM/512 GB SSD)

input {
  elastic_agent {
    port => 5044
    ssl_enabled => true
    ssl_certificate_authorities => ["/etc/logstash/certs/elasticsearch-ca.pem"]
    ssl_certificate => "/etc/logstash/certs/logstash.crt"
    ssl_key => "/etc/logstash/certs/logstash.pkcs8.key"
    ssl_client_authentication => "required"
  }
}

filter {
  grok {
    match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{IP:syslog_ip} %{GREEDYDATA:message}" }
    overwrite => ["message"]
  }

  mutate {
    remove_field => ["syslog_timestamp", "syslog_ip"]
  }

  if [message] =~ /type="utm" subtype="ips"/ or [message] =~ /type="event" subtype="system"/ {
    mutate {
      add_tag => ["send_to_QRadar"]
    }
  } else {
    mutate {
      add_tag => ["send_to_elasticsearch"]
    }
  }
}

output {

  if "send_to_QRadar" in [tags] {
    tcp {
      host => "192.168.3.180"
      port => 514
      codec => line {
        format => "%{message}"
      }
    }
  }
  if "send_to_elasticsearch" in [tags] {
    elasticsearch {
      hosts => ["https://192.168.3.171:9200"]
      data_stream => "true"
      user => "elastic"
      password => "password"
      cacert => "/etc/logstash/certs/elasticsearch-ca.pem"
    }
  }
}
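Both outputs run in the same pipeline, so as I understand it, backpressure from either output (the tcp output to QRadar or the elasticsearch output) slows the whole pipeline. One variant I am considering is splitting them with Logstash's pipeline-to-pipeline communication, with persistent queues on the downstream pipelines to absorb bursts. A sketch (the pipeline IDs and config paths are made up for illustration):

```yaml
# pipelines.yml - hypothetical split; IDs and paths are illustrative
- pipeline.id: ingest
  # input + filters, then: output { pipeline { send_to => ["qradar", "es"] } }
  path.config: "/etc/logstash/conf.d/ingest.conf"
  pipeline.batch.size: 1000
- pipeline.id: qradar
  # input { pipeline { address => "qradar" } } + the tcp output
  path.config: "/etc/logstash/conf.d/qradar.conf"
  queue.type: persisted   # buffer so a slow tcp peer does not stall ingest
- pipeline.id: es
  # input { pipeline { address => "es" } } + the elasticsearch output
  path.config: "/etc/logstash/conf.d/es.conf"
  queue.type: persisted
```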

At 10,000 EPS during the stress test, using the configuration above, the Logstash monitoring curve becomes unstable and throughput plateaus at roughly 6,000-7,000 EPS.

I suspected the issue might be Elasticsearch. However, the monitoring data shows no sign of excessive CPU or RAM usage on the Elasticsearch node, and I checked the I/O statistics with iostat — it doesn't appear to be an I/O bottleneck either.



When I change the output to null, the Logstash monitoring curve stabilizes at around 10,000 EPS:

Apache JMeter (192.168.3.170) -> Elastic-Agent [Fortigate] (192.168.3.172:515) -> Logstash (192.168.3.172:5044) -> Output Null

output {
  null {}
}
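A further isolation step I could try next is enabling only one real output at a time — for example, Elasticsearch alone, with the same settings as above — to see which output introduces the backpressure:

```conf
# Isolation test: elasticsearch output only, tcp output disabled
output {
  elasticsearch {
    hosts => ["https://192.168.3.171:9200"]
    data_stream => "true"
    user => "elastic"
    password => "password"
    cacert => "/etc/logstash/certs/elasticsearch-ca.pem"
  }
}
```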

Does anyone have insights into what could be causing this problem?