Understanding Unusual Logstash Traffic Flow Rates

Hi,
I'd like to understand some apparently odd behaviour we're seeing: the traffic hitting our logstash instances does not match what is being ingested into elasticsearch and s3. Note that we're using the output isolator pattern to decouple the two outputs from one another.

Our test pipeline is:

  • 2 x filebeat
  • 6 x logstash
  • 6 x elasticsearch nodes (general purpose)

What we are seeing is that the logstash instances are collectively receiving/emitting around 40k msg/sec, while elasticsearch and s3 are each receiving around 13.5k msg/sec. We have persistent queues enabled on the logstash nodes and don't see messages backing up there, and we're not seeing rejections/exceptions on the elasticsearch nodes themselves. CPU utilisation is between 20-35% on the logstash nodes and 25-45% on the elasticsearch nodes, so nothing seems unusual.

Can I ask how logstash reports event rates when using the output isolator pattern? I'm struggling to understand the discrepancy between what logstash reports it is receiving/emitting and what is being indexed to elasticsearch and written to s3.
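One way to narrow this down would be to compare the per-pipeline counters rather than a single node-level figure. Below is a minimal sketch, assuming the Logstash monitoring API is reachable on its default port 9600 and that `/_node/stats/pipelines` returns an `events` object with `in` and `out` counters for each pipeline. The helper names and the sample numbers are made up for illustration, and whether the node-level rate is a sum across all three pipelines is exactly the part I have not confirmed.

```python
import json
from urllib.request import urlopen


def fetch_pipeline_stats(host="localhost", port=9600):
    """Fetch per-pipeline stats from one Logstash node's monitoring API.

    Assumes the API listens on the default port and is reachable without auth.
    """
    url = f"http://{host}:{port}/_node/stats/pipelines"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def events_by_pipeline(stats):
    """Map each pipeline id to its (events in, events out) counters."""
    result = {}
    for pipeline_id, body in stats.get("pipelines", {}).items():
        events = body.get("events") or {}
        result[pipeline_id] = (events.get("in", 0), events.get("out", 0))
    return result


def summed_in(stats):
    """Sum the 'in' counter over every pipeline on the node."""
    return sum(events_in for events_in, _ in events_by_pipeline(stats).values())


# Made-up counters for illustration only, not measurements from our cluster.
SAMPLE = {
    "pipelines": {
        "buffered-input": {"events": {"in": 1000, "out": 1000}},
        "buffered-out-es": {"events": {"in": 1000, "out": 1000}},
        "buffered-out-s3": {"events": {"in": 1000, "out": 1000}},
    }
}

for pipeline_id, (events_in, events_out) in events_by_pipeline(SAMPLE).items():
    print(f"{pipeline_id}: in={events_in} out={events_out}")
print("summed in:", summed_in(SAMPLE))
```

The counters are cumulative, so a rate needs two samples taken a known interval apart, e.g. two calls to `fetch_pipeline_stats()` with the difference divided by the elapsed seconds. If the summed figure is what the monitoring shows, each event would be counted once per pipeline it passes through, which could account for a figure roughly three times the per-output rate.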

Logstash config looks like this:

- pipeline.id: buffered-input
  queue.type: persisted
  config.string: |
    input {
      beats {
        port => 5044
      }
    }
    output {
      if [application] =~ "test" {
        pipeline {
          send_to => s3
        }
      }
      pipeline { send_to => es }
    }
- pipeline.id: buffered-out-es
  queue.type: persisted
  config.string: |
    input { pipeline { address => es } }
    output {
      elasticsearch {
        hosts => "elasticsearch:9200"
        index => "logs-%{+YYYY.MM.dd}"
        manage_template => false
        retry_max_interval => 16
        timeout => 60
        pipeline => 'test_pipeline'
        pool_max => 5000
      }
    }
- pipeline.id: buffered-out-s3
  queue.type: persisted
  pipeline.workers: 1
  pipeline.batch.size: 10000
  pipeline.batch.delay: 500
  config.string: |
    input { pipeline { address => s3 } }
    output {
      s3 {
        region => "eu-west-1"
        bucket => "test_bucket"
        time_file => 1
        rotation_strategy => "time"
        prefix => "%{application}/%{alias}/%{+YYYY}/%{+MM}/%{+dd}"
        codec => "json_lines"
        canned_acl => "private"
        encoding => "gzip"
        additional_settings => {
          "force_path_style" => true
          "follow_redirects" => false
        }
      }
    }

Thx
D