A Starting Point for Tuning Logstash

It's not easy to get the full picture from your posts. To summarize:

  • 23 pipelines + 3 under heavy load, on 16 cores & 32 GB memory

  • ~10 ms event processing time on the heavily loaded pipelines

  • everything runs on a single LS host

  • Are pipeline.workers and pipeline.batch.size at their defaults? (per-pipeline overrides are sketched after this list)

pipeline.workers: 16       # default = number of CPU cores, so max 16 on this host
pipeline.batch.size: 125
pipeline.batch.delay: 50   # ms
pipeline.ordered: auto
  • What are your Xms and Xmx values in jvm.options?
  • Are all settings the same across pipelines? What is specific to those 3 heavy pipelines?
  • Are you using the memory queue or persistent queues?
  • How many ES data nodes are you using?
  • Have you checked the ES logs? Especially given the slow inserts and the DLQ entries.
  • What is the avg/max message size in those heavily loaded pipelines?
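
If the answer to the workers/batch/queue questions is "all defaults", note that these can be overridden per pipeline in pipelines.yml. A minimal sketch, assuming hypothetical pipeline IDs, config paths, and values purely for illustration:

- pipeline.id: heavy-pipeline-1
  path.config: "/etc/logstash/conf.d/heavy1.conf"
  pipeline.workers: 8        # give the heavy pipelines more workers
  pipeline.batch.size: 250
  queue.type: persisted      # persistent queue, if disk I/O can handle it
- pipeline.id: light-pipeline-1
  path.config: "/etc/logstash/conf.d/light1.conf"
  # no overrides: workers = CPU cores, batch.size = 125, memory queue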

Since you said "my configuration was simple" and there is not much code in the filters, you can try:
Edit: make a backup before any changes.

pipeline.batch.size: 250                 # only for the heavy pipelines
compression_level => 5                   # or higher, to reduce network traffic (at the cost of CPU)
ssl_enabled => true                      # yes, this should be on by default
pool_max => 2000                         # increase to reduce connection reopening
pool_max_per_route => 200
ssl_supported_protocols => ["TLSv1.3"]   # 1.3 only; faster to establish the secure channel
resurrect_delay => 2
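
To be clear, pipeline.batch.size belongs in pipelines.yml (or logstash.yml), while the => settings go into the elasticsearch output of the heavy pipelines. A minimal sketch of that output block, with placeholder hosts and index name, and assuming a recent logstash-output-elasticsearch plugin for ssl_enabled and compression_level:

output {
  elasticsearch {
    hosts => ["https://es-data-1:9200", "https://es-data-2:9200"]   # data nodes only
    index => "heavy-index-%{+YYYY.MM.dd}"                           # placeholder
    ssl_enabled => true
    ssl_supported_protocols => ["TLSv1.3"]
    compression_level => 5
    pool_max => 2000
    pool_max_per_route => 200
    resurrect_delay => 2
  }
}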
  • Exclude dedicated master nodes from the hosts list.
  • Check the ES logs (all nodes) to find out why you are getting dlq_routed events; you can do it manually, or with Metricbeat or Elastic Agent.
  • Use sniffing mode; check this thread.
  • Investigate LS statistics for all pipelines (see the node stats call after this list).
  • Check the value of tcp_keepalive_time at the OS level; only check it, do not change it.
  • Allocate 2-3 ES data nodes only to the heavily loaded pipelines; the other pipelines should use the remaining nodes.
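
For the statistics point above, the Logstash node stats API exposes per-pipeline event counts, plugin timings, and queue/DLQ counters; a quick check, assuming the default API port 9600 and a hypothetical pipeline ID:

curl -s 'http://localhost:9600/_node/stats/pipelines?pretty'                   # all pipelines
curl -s 'http://localhost:9600/_node/stats/pipelines/heavy-pipeline-1?pretty'  # a single pipeline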

This is not a simple optimization activity since it's on live data & load, where the Jedi council doesn't have full information or access. I truly hope other Jedi will give their own opinions.

Have you used live pipeline monitoring in Kibana? If it's not already enabled, you should set it up:

PUT _cluster/settings
{
  "persistent": {
    "xpack.monitoring.collection.enabled": true
  }
}
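
Once the monitoring data is flowing, per-pipeline throughput and latency show up under Stack Monitoring → Logstash → Pipelines in Kibana, which should make it easier to spot which of the 3 heavy pipelines is the real bottleneck.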