Filebeat slow to send logs to Logstash

Hello all,

Logstash & Filebeat version: 5.5.1
I am new to the ELK stack. I have set up an Elasticsearch cluster on k8s. In the beginning, Filebeat was very slow to send logs to Logstash (it took hours), and the Filebeat logs showed:

2019/12/13 10:15:07.798513 sync.go:85: ERR Failed to publish events caused by: read tcp 10.1.94.36:41834->10.0.59.193:5044: i/o timeout
2019/12/13 10:15:07.798527 single.go:91: INFO Error publishing events (retrying): read tcp 10.1.94.36:41834->10.0.59.193:5044: i/o timeout

After some research, I applied the following modifications to the Filebeat and Logstash configurations:
Logstash config:

congestion_threshold => 180
client_inactivity_timeout => 1200

Filebeat config:

scan_frequency: 30s
bulk_max_size: 2048

After these changes, more events could be sent, as the window size sometimes grew up to 400 (it never goes beyond 400, even though the limit is 2048). Before these changes, the window size never grew past 10.
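Doing some rough math from the debug log below: each 120-event window takes roughly 5 to 12 seconds to be acknowledged, so draining a single 2048-event batch (about 17 windows) takes on the order of two to three minutes, which matches how slow ingestion feels.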
Also, I noticed that Filebeat is constantly sending new events (even when not many logs are being generated). Is this normal behavior?

Below are the current configs and logs of Filebeat and Logstash.

**Filebeat config:**
apiVersion: v1
data:
  filebeat.yml: |-
    filebeat.prospectors:
    - input_type: log
      paths:
        - /var/log/containers/*.log
      scan_frequency: 30s
      symlinks: true
      json.message_key: log
      json.keys_under_root: true
      json.add_error_key: true
      fields_under_root: true
      fields:
        type: kube-logs
        node.hostname: ${NODE_HOSTNAME}
        pod.ip: ${POD_IP}
      tags:
        - k8s-app
    filebeat.config.modules:
      # Set to true to enable config reloading
      reload.enabled: true
    output.logstash:
      hosts: logstash:5044
      timeout: 15
      bulk_max_size: 2048
      ssl.certificate_authorities: ["/usr/share/elasticsearch/config/tls/ca.crt"]
      ssl.certificate: "/usr/share/elasticsearch/config/tls/filebeat.crt"
      ssl.key: "/usr/share/elasticsearch/config/tls/filebeat.key"
      ssl.key_passphrase: ${APP_KEYSTORE_PASSWORD}
    logging.level: debug
kind: ConfigMap
metadata:
  creationTimestamp: 2019-12-12T13:06:29Z
  labels:
    app: logging-elk-filebeat-ds
    chart: ibm-icplogging-2.2.0
    component: filebeat-ds
    heritage: Tiller
    release: logging
  name: logging-elk-filebeat-ds-config
  namespace: kube-system
  resourceVersion: "91594477"
  selfLink: /api/v1/namespaces/kube-system/configmaps/logging-elk-filebeat-ds-config
  uid: 35ed571f-1ce0-11ea-b481-506b8dfb518b

Filebeat logs:

2019/12/13 10:26:17.025879 output.go:109: DBG  output worker: publish 2048 events
2019/12/13 10:26:17.025891 sync.go:107: DBG  Try to publish 2048 events to logstash with window size 120
2019/12/13 10:26:19.057002 metrics.go:39: INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=2 libbeat.logstash.publish.read_bytes=1871 libbeat.logstash.publish.write_bytes=27259 libbeat.logstash.published_and_acked_events=247 libbeat.publisher.published_events=2048 publish.events=2048 registrar.states.current=1 registrar.states.update=2048 registrar.writes=1
2019/12/13 10:26:23.399544 prospector.go:183: DBG  Run prospector
2019/12/13 10:26:23.399579 prospector_log.go:70: DBG  Start next scan
2019/12/13 10:26:23.400226 prospector_log.go:226: DBG  Check file for harvesting: /var/log/containers/nvidia-device-plugin-bc5x8_kube-system_nvidia-device-plugin-94ad7e80e7f62375df450bfb950f1bc2845dd63851af8e87dee32fc68229b3f6.log
2019/12/13 10:26:23.400256 prospector_log.go:259: DBG  Update existing file for harvesting: /var/log/containers/nvidia-device-plugin-bc5x8_kube-system_nvidia-device-plugin-94ad7e80e7f62375df450bfb950f1bc2845dd63851af8e87dee32fc68229b3f6.log, offset: 140
2019/12/13 10:26:23.400263 prospector_log.go:313: DBG  File didn't change: /var/log/containers/nvidia-device-plugin-bc5x8_kube-system_nvidia-device-plugin-94ad7e80e7f62375df450bfb950f1bc2845dd63851af8e87dee32fc68229b3f6.log
2019/12/13 10:26:23.400271 prospector_log.go:226: DBG  Check file for harvesting: /var/log/containers/monitoring-prometheus-nodeexporter-t5bsq_kube-system_nodeexporter-f14734f3c57c2b266eeb058b4e32dbc78fcfaa63ad04e9deb20634aa60e66e84.log
2019/12/13 10:26:23.400279 prospector_log.go:259: DBG  Update existing file for harvesting: /var/log/containers/monitoring-prometheus-nodeexporter-t5bsq_kube-system_nodeexporter-f14734f3c57c2b266eeb058b4e32dbc78fcfaa63ad04e9deb20634aa60e66e84.log, offset: 5605
2019/12/13 10:26:23.400284 prospector_log.go:313: DBG  File didn't change: /var/log/containers/monitoring-prometheus-nodeexporter-t5bsq_kube-system_nodeexporter-f14734f3c57c2b266eeb058b4e32dbc78fcfaa63ad04e9deb20634aa60e66e84.log
2019/12/13 10:26:23.400291 prospector_log.go:226: DBG  Check file for harvesting: /var/log/containers/calico-node-tgxkl_kube-system_install-cni-e65fb542a64c3fe9b3d1de82fbe5e5b79c1cc0cb23e1911156a28855a0f8952f.log
2019/12/13 10:26:23.400297 prospector_log.go:259: DBG  Update existing file for harvesting: /var/log/containers/calico-node-tgxkl_kube-system_install-cni-e65fb542a64c3fe9b3d1de82fbe5e5b79c1cc0cb23e1911156a28855a0f8952f.log, offset: 3708
2019/12/13 10:26:23.400302 prospector_log.go:313: DBG  File didn't change: /var/log/containers/calico-node-tgxkl_kube-system_install-cni-e65fb542a64c3fe9b3d1de82fbe5e5b79c1cc0cb23e1911156a28855a0f8952f.log
2019/12/13 10:26:23.400308 prospector_log.go:226: DBG  Check file for harvesting: /var/log/containers/logging-elk-filebeat-ds-9c6bd_kube-system_filebeat-8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798.log
2019/12/13 10:26:23.400320 prospector_log.go:245: DBG  Start harvester for new file: /var/log/containers/logging-elk-filebeat-ds-9c6bd_kube-system_filebeat-8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798.log
2019/12/13 10:26:26.650513 sync.go:78: DBG  120 events out of 2048 events sent to logstash. Continue sending
2019/12/13 10:26:26.650545 sync.go:107: DBG  Try to publish 1928 events to logstash with window size 120
2019/12/13 10:26:31.125702 sync.go:78: DBG  120 events out of 1928 events sent to logstash. Continue sending
2019/12/13 10:26:31.125744 sync.go:107: DBG  Try to publish 1808 events to logstash with window size 120
2019/12/13 10:26:36.691001 sync.go:78: DBG  120 events out of 1808 events sent to logstash. Continue sending
2019/12/13 10:26:36.691041 sync.go:107: DBG  Try to publish 1688 events to logstash with window size 120
2019/12/13 10:26:48.675369 sync.go:78: DBG  120 events out of 1688 events sent to logstash. Continue sending
2019/12/13 10:26:48.675407 sync.go:107: DBG  Try to publish 1568 events to logstash with window size 120
2019/12/13 10:26:49.056948 metrics.go:39: INFO Non-zero metrics in the last 30s: libbeat.logstash.publish.read_bytes=140 libbeat.logstash.publish.write_bytes=33934
2019/12/13 10:26:55.722936 sync.go:78: DBG  120 events out of 1568 events sent to logstash. Continue sending
2019/12/13 10:26:55.722971 sync.go:107: DBG  Try to publish 1448 events to logstash with window size 120
2019/12/13 10:27:00.115661 sync.go:78: DBG  120 events out of 1448 events sent to logstash. Continue sending
2019/12/13 10:27:00.115704 sync.go:107: DBG  Try to publish 1328 events to logstash with window size 120
2019/12/13 10:27:13.896299 sync.go:78: DBG  120 events out of 1328 events sent to logstash. Continue sending
2019/12/13 10:27:13.896345 sync.go:107: DBG  Try to publish 1208 events to logstash with window size 120
2019/12/13 10:27:19.056918 metrics.go:39: INFO Non-zero metrics in the last 30s: libbeat.logstash.publish.read_bytes=105 libbeat.logstash.publish.write_bytes=24952
2019/12/13 10:27:24.090525 sync.go:78: DBG  120 events out of 1208 events sent to logstash. Continue sending
2019/12/13 10:27:24.090558 sync.go:107: DBG  Try to publish 1088 events to logstash with window size 120
2019/12/13 10:27:30.011920 sync.go:78: DBG  120 events out of 1088 events sent to logstash. Continue sending
2019/12/13 10:27:30.011959 sync.go:107: DBG  Try to publish 968 events to logstash with window size 120
2019/12/13 10:27:39.834750 sync.go:78: DBG  120 events out of 968 events sent to logstash. Continue sending
2019/12/13 10:27:39.834781 sync.go:107: DBG  Try to publish 848 events to logstash with window size 120
2019/12/13 10:27:49.060311 metrics.go:39: INFO Non-zero metrics in the last 30s: libbeat.logstash.publish.read_bytes=105 libbeat.logstash.publish.write_bytes=24718
^C2019/12/13 10:27:54.838827 client.go:194: DBG  handle error: read tcp 10.1.94.36:42626->10.0.59.193:5044: i/o timeout
2019/12/13 10:27:54.838872 sync.go:78: DBG  0 events out of 848 events sent to logstash. Continue sending
2019/12/13 10:27:54.838882 sync.go:58: DBG  close connection
2019/12/13 10:27:54.838889 client.go:110: DBG  closing
2019/12/13 10:27:54.839175 sync.go:85: ERR Failed to publish events caused by: read tcp 10.1.94.36:42626->10.0.59.193:5044: i/o timeout
2019/12/13 10:27:54.839203 single.go:91: INFO Error publishing events (retrying): read tcp 10.1.94.36:42626->10.0.59.193:5044: i/o timeout
2019/12/13 10:27:54.839211 sync.go:58: DBG  close connection
2019/12/13 10:27:54.839230 single.go:156: DBG  send fail
2019/12/13 10:27:54.839237 single.go:160: DBG  reset fails
2019/12/13 10:27:55.839410 sync.go:53: DBG  connect
2019/12/13 10:27:55.859760 sync.go:107: DBG  Try to publish 848 events to logstash with window size 60
2019/12/13 10:27:56.677498 window.go:51: DBG  update current window size: 60
2019/12/13 10:27:56.677542 sync.go:78: DBG  60 events out of 848 events sent to logstash. Continue sending
2019/12/13 10:27:56.677556 sync.go:107: DBG  Try to publish 788 events to logstash with window size 90

Logstash config:

apiVersion: v1
data:
  k8s.conf: |-
    input {
      beats {
        port => 5044
        ssl => true
        congestion_threshold => 180
        client_inactivity_timeout => 1200
        ssl_certificate_authorities => ["/usr/share/elasticsearch/config/tls/ca.crt"]
        ssl_certificate => "/usr/share/elasticsearch/config/tls/logstash.crt"
        ssl_key => "/usr/share/elasticsearch/config/tls/logstash.key"
        ssl_key_passphrase => "${APP_KEYSTORE_PASSWORD}"
        ssl_verify_mode => "force_peer"
      }
    }

    filter {
      if [type] == "kube-logs" {
        mutate {
          rename => { "message" => "log" }
          remove_field => ["host"]
        }

        date {
          match => ["time", "ISO8601"]
        }

        # Sample:
        # tiller-1279707952-sgd08_kube-system_tiller-0c51282d195d1c21307f6a48b9118d690441cda498fc5a2790c043407eab235b.log
        # filebeat-j357d_default_filebeat-1a3113e374ad7937a3efa36c4bb42b46b976bcd7cd96223e6b9e6e3df08d802a.log
        # appstdoutpod_default_app-stdout-container-01c2e7a7b105d9141825ea3ae5634b580fdd20a5a4ee890cdbe0816ca002623c.log
        # unified-router-4047118581-sm913_kube-system_unified-router-ddda8a8cbcb74c45b64a4b18997b4f2928c998a37e45037cd0304eaa56d1634f.log
        dissect {
          mapping => {
            "source" => "/var/log/containers/%{kubernetes.pod}_%{kubernetes.namespace}_%{container_file_ext}"
          }
        }

        dissect {
          mapping => {
            "container_file_ext" => "%{container}.%{?file_ext}"
          }
          remove_field => ["host", "container_file_ext"]
        }

        grok {
          "match" => {
            "container" => "^%{DATA:kubernetes.container_name}-(?<kubernetes.container_id>[0-9A-Za-z]{64,64})"
          }
          remove_field => ["container"]
        }
      }
    }

    filter {
      # Drop empty lines
      if [log] =~ /^\s*$/ {
        drop { }
      }
      # Attempt to parse JSON, but ignore failures and pass entries on as-is
      json {
        source => "log"
        skip_on_invalid_json => true
      }
    }

    output {
      elasticsearch {
        hosts => "elasticsearch:9200"
        index => "logstash-%{+YYYY.MM.dd}"
        document_type => "%{[@metadata][type]}"
        ssl => true
        ssl_certificate_verification => true
        keystore => "/usr/share/elasticsearch/config/tls/logstash-elasticsearch-keystore.jks"
        keystore_password => "${APP_KEYSTORE_PASSWORD}"
        truststore => "/usr/share/elasticsearch/config/tls/truststore.jks"
        truststore_password => "${CA_TRUSTSTORE_PASSWORD}"
      }
    }
  logstash.yml: |-
    config.reload.automatic: true
    http.host: "0.0.0.0"
    path.config: /usr/share/logstash/pipeline
    xpack.monitoring.enabled: false
    log.level: debug
kind: ConfigMap
metadata:
  creationTimestamp: 2019-12-12T13:06:29Z
  labels:
    app: logging-elk-elasticsearch
    chart: ibm-icplogging-2.2.0
    component: logstash
    heritage: Tiller
    release: logging
  name: logging-elk-logstash-config
  namespace: kube-system
  resourceVersion: "91621635"
  selfLink: /api/v1/namespaces/kube-system/configmaps/logging-elk-logstash-config
  uid: 35f2891b-1ce0-11ea-b481-506b8dfb518b

Logstash logs:

[2019-12-13T10:30:37,535][DEBUG][logstash.pipeline        ] output received {"event"=>{"pod"=>{"ip"=>"10.1.94.36"}, "offset"=>620854, "log"=>"  \"stream\": \"stderr\",", "kubernetes.pod"=>"logging-elk-filebeat-ds-9c6bd", "kubernetes.container_name"=>"filebeat", "input_type"=>"log", "source"=>"/var/log/containers/logging-elk-filebeat-ds-9c6bd_kube-system_filebeat-8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798.log", "type"=>"kube-logs", "tags"=>["k8s-app", "beats_input_raw_event"], "kubernetes.container_id"=>"8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798", "node"=>{"hostname"=>"172.16.7.23"}, "@timestamp"=>2019-12-12T23:31:59.767Z, "stream"=>"stderr", "kubernetes.namespace"=>"kube-system", "beat"=>{"hostname"=>"logging-elk-filebeat-ds-9c6bd", "name"=>"logging-elk-filebeat-ds-9c6bd", "version"=>"5.5.1"}, "@version"=>"1", "time"=>"2019-12-12T23:31:59.767359186Z"}}
[2019-12-13T10:30:37,535][DEBUG][logstash.pipeline        ] output received {"event"=>{"offset"=>280076, "pod"=>{"ip"=>"10.1.94.36"}, "log"=>"  \"stream\": \"stderr\",", "kubernetes.pod"=>"logging-elk-filebeat-ds-9c6bd", "kubernetes.container_name"=>"filebeat", "input_type"=>"log", "source"=>"/var/log/containers/logging-elk-filebeat-ds-9c6bd_kube-system_filebeat-8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798.log", "type"=>"kube-logs", "tags"=>["k8s-app", "beats_input_raw_event"], "kubernetes.container_id"=>"8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798", "node"=>{"hostname"=>"172.16.7.23"}, "@timestamp"=>2019-12-13T03:35:33.285Z, "stream"=>"stderr", "kubernetes.namespace"=>"kube-system", "beat"=>{"hostname"=>"logging-elk-filebeat-ds-9c6bd", "name"=>"logging-elk-filebeat-ds-9c6bd", "version"=>"5.5.1"}, "@version"=>"1", "time"=>"2019-12-13T03:35:33.285527164Z"}}
[2019-12-13T10:30:37,535][DEBUG][logstash.pipeline        ] output received {"event"=>{"pod"=>{"ip"=>"10.1.94.36"}, "offset"=>19024, "log"=>"2019-12-13 10:03:39.178 [INFO][76] health.go 150: Overall health summary=&health.HealthReport{Live:true, Ready:true}", "kubernetes.pod"=>"calico-node-tgxkl", "kubernetes.container_name"=>"calico-node", "input_type"=>"log", "source"=>"/var/log/containers/calico-node-tgxkl_kube-system_calico-node-2a0252daf4ec37033cede126867eb7b054c33f555683c17a620be3048b0514a4.log", "type"=>"kube-logs", "tags"=>["k8s-app", "beats_input_raw_event"], "kubernetes.container_id"=>"2a0252daf4ec37033cede126867eb7b054c33f555683c17a620be3048b0514a4", "node"=>{"hostname"=>"172.16.7.23"}, "@timestamp"=>2019-12-13T10:03:39.178Z, "stream"=>"stdout", "kubernetes.namespace"=>"kube-system", "beat"=>{"hostname"=>"logging-elk-filebeat-ds-9c6bd", "name"=>"logging-elk-filebeat-ds-9c6bd", "version"=>"5.5.1"}, "@version"=>"1", "time"=>"2019-12-13T10:03:39.178887864Z"}}
[2019-12-13T10:30:37,535][DEBUG][logstash.pipeline        ] output received {"event"=>{"offset"=>607350, "pod"=>{"ip"=>"10.1.94.36"}, "log"=>"  },", "kubernetes.pod"=>"logging-elk-filebeat-ds-9c6bd", "kubernetes.container_name"=>"filebeat", "input_type"=>"log", "source"=>"/var/log/containers/logging-elk-filebeat-ds-9c6bd_kube-system_filebeat-8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798.log", "type"=>"kube-logs", "tags"=>["k8s-app", "beats_input_raw_event"], "kubernetes.container_id"=>"8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798", "node"=>{"hostname"=>"172.16.7.23"}, "@timestamp"=>2019-12-12T23:36:02.963Z, "stream"=>"stderr", "kubernetes.namespace"=>"kube-system", "beat"=>{"hostname"=>"logging-elk-filebeat-ds-9c6bd", "name"=>"logging-elk-filebeat-ds-9c6bd", "version"=>"5.5.1"}, "@version"=>"1", "time"=>"2019-12-12T23:36:02.963324975Z"}}
[2019-12-13T10:30:37,535][DEBUG][logstash.pipeline        ] output received {"event"=>{"offset"=>591443, "pod"=>{"ip"=>"10.1.94.36"}, "log"=>"  \"log\": \"  \\\"tags\\\": [\",", "kubernetes.pod"=>"logging-elk-filebeat-ds-9c6bd", "kubernetes.container_name"=>"filebeat", "input_type"=>"log", "source"=>"/var/log/containers/logging-elk-filebeat-ds-9c6bd_kube-system_filebeat-8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798.log", "type"=>"kube-logs", "tags"=>["k8s-app", "beats_input_raw_event"], "kubernetes.container_id"=>"8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798", "node"=>{"hostname"=>"172.16.7.23"}, "@timestamp"=>2019-12-12T23:44:04.957Z, "stream"=>"stderr", "kubernetes.namespace"=>"kube-system", "beat"=>{"hostname"=>"logging-elk-filebeat-ds-9c6bd", "name"=>"logging-elk-filebeat-ds-9c6bd", "version"=>"5.5.1"}, "@version"=>"1", "time"=>"2019-12-12T23:44:04.957842018Z"}}
[2019-12-13T10:30:37,535][DEBUG][logstash.pipeline        ] output received {"event"=>{"offset"=>1143522, "pod"=>{"ip"=>"10.1.94.36"}, "log"=>"  \"tags\": [", "kubernetes.pod"=>"logging-elk-filebeat-ds-9c6bd", "kubernetes.container_name"=>"filebeat", "input_type"=>"log", "source"=>"/var/log/containers/logging-elk-filebeat-ds-9c6bd_kube-system_filebeat-8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798.log", "type"=>"kube-logs", "tags"=>["k8s-app", "beats_input_raw_event"], "kubernetes.container_id"=>"8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798", "node"=>{"hostname"=>"172.16.7.23"}, "@timestamp"=>2019-12-12T19:59:22.141Z, "stream"=>"stderr", "kubernetes.namespace"=>"kube-system", "beat"=>{"hostname"=>"logging-elk-filebeat-ds-9c6bd", "name"=>"logging-elk-filebeat-ds-9c6bd", "version"=>"5.5.1"}, "@version"=>"1", "time"=>"2019-12-12T19:59:22.141379588Z"}}
[2019-12-13T10:30:37,535][DEBUG][logstash.pipeline        ] output received {"event"=>{"pod"=>{"ip"=>"10.1.94.36"}, "offset"=>883471, "log"=>"  \"input_type\": \"log\",", "kubernetes.pod"=>"logging-elk-filebeat-ds-9c6bd", "kubernetes.container_name"=>"filebeat", "input_type"=>"log", "source"=>"/var/log/containers/logging-elk-filebeat-ds-9c6bd_kube-system_filebeat-8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798.log", "type"=>"kube-logs", "tags"=>["k8s-app", "beats_input_raw_event"], "kubernetes.container_id"=>"8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798", "node"=>{"hostname"=>"172.16.7.23"}, "@timestamp"=>2019-12-12T21:12:21.793Z, "stream"=>"stderr", "kubernetes.namespace"=>"kube-system", "beat"=>{"hostname"=>"logging-elk-filebeat-ds-9c6bd", "name"=>"logging-elk-filebeat-ds-9c6bd", "version"=>"5.5.1"}, "@version"=>"1", "time"=>"2019-12-12T21:12:21.793640482Z"}}
[2019-12-13T10:30:38,331][DEBUG][logstash.instrument.periodicpoller.cgroup] Error, cannot retrieve cgroups information {:exception=>"Errno::ENOENT", :message=>"No such file or directory - /sys/fs/cgroup/cpuacct/kubepods/burstable/pod916b36cc-1d0e-11ea-8b5c-506b8df5d4ee/7212fb00a902fb1ef991ea52125a16d98f4c117df6fca7af828589026d083625/cpuacct.usage"}

Update

I changed the path of the Filebeat input file so that I have control over it, and it seems to work fine this way: only 1 event is queued when I add a new line to the log file. However, when the input comes from the logs generated by the containers, the batches are always 2048 events. I know for a fact that not that many logs are generated between each scan. Any input on this? Why would logs with a different format get processed faster?
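
For reference, this is roughly the test prospector I used for that check (the path /var/log/test/app.log is just an example, not my real one):

    filebeat.prospectors:
    - input_type: log
      paths:
        - /var/log/test/app.log   # example path: a plain log file I append test lines to by hand
      scan_frequency: 30s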