Hello all,
Logstash & Filebeat version: 5.5.1
I am new to the ELK stack. I have set up an Elasticsearch cluster on k8s. In the beginning, Filebeat was very slow to send logs to Logstash (it took hours), and the Filebeat logs showed:
```
2019/12/13 10:15:07.798513 sync.go:85: ERR Failed to publish events caused by: read tcp 10.1.94.36:41834->10.0.59.193:5044: i/o timeout
2019/12/13 10:15:07.798527 single.go:91: INFO Error publishing events (retrying): read tcp 10.1.94.36:41834->10.0.59.193:5044: i/o timeout
```
After some research, I applied the following modifications to the Filebeat and Logstash configurations:
Logstash config:
```
congestion_threshold => 180
client_inactivity_timeout => 1200
```
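For context, those two options sit on the beats input in my Logstash pipeline config. Roughly like this (a sketch only; the SSL settings are omitted here, and I understand congestion_threshold is deprecated in newer versions of the beats input plugin):

```
input {
  beats {
    port => 5044
    # raised from the defaults while troubleshooting the i/o timeouts
    congestion_threshold => 180
    client_inactivity_timeout => 1200
  }
}
```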
Filebeat config (these sit under the prospector and output.logstash sections of the full config below):
```
scan_frequency: 30s
bulk_max_size: 2048
```
After these changes, more events are getting through, and the window size sometimes grows to 400 (it never goes beyond 400, even though bulk_max_size is 2048). Before the changes, the window size never grew past 10.
I also noticed that Filebeat is constantly sending new events, even when few logs are being generated. Is this normal behavior?
Below are the current configs and logs of Filebeat and Logstash.
**Filebeat config:**
```yaml
apiVersion: v1
data:
  filebeat.yml: |-
    filebeat.prospectors:
    - input_type: log
      paths:
      - /var/log/containers/*.log
      scan_frequency: 30s
      symlinks: true
      json.message_key: log
      json.keys_under_root: true
      json.add_error_key: true
      fields_under_root: true
      fields:
        type: kube-logs
        node.hostname: ${NODE_HOSTNAME}
        pod.ip: ${POD_IP}
      tags:
      - k8s-app

    filebeat.config.modules:
      # Set to true to enable config reloading
      reload.enabled: true

    output.logstash:
      hosts: logstash:5044
      timeout: 15
      bulk_max_size: 2048
      ssl.certificate_authorities: ["/usr/share/elasticsearch/config/tls/ca.crt"]
      ssl.certificate: "/usr/share/elasticsearch/config/tls/filebeat.crt"
      ssl.key: "/usr/share/elasticsearch/config/tls/filebeat.key"
      ssl.key_passphrase: ${APP_KEYSTORE_PASSWORD}

    logging.level: debug
kind: ConfigMap
metadata:
  creationTimestamp: 2019-12-12T13:06:29Z
  labels:
    app: logging-elk-filebeat-ds
    chart: ibm-icplogging-2.2.0
    component: filebeat-ds
    heritage: Tiller
    release: logging
  name: logging-elk-filebeat-ds-config
  namespace: kube-system
  resourceVersion: "91594477"
  selfLink: /api/v1/namespaces/kube-system/configmaps/logging-elk-filebeat-ds-config
  uid: 35ed571f-1ce0-11ea-b481-506b8dfb518b
```
**Filebeat logs:**
```
2019/12/13 10:26:17.025879 output.go:109: DBG output worker: publish 2048 events
2019/12/13 10:26:17.025891 sync.go:107: DBG Try to publish 2048 events to logstash with window size 120
2019/12/13 10:26:19.057002 metrics.go:39: INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=2 libbeat.logstash.publish.read_bytes=1871 libbeat.logstash.publish.write_bytes=27259 libbeat.logstash.published_and_acked_events=247 libbeat.publisher.published_events=2048 publish.events=2048 registrar.states.current=1 registrar.states.update=2048 registrar.writes=1
2019/12/13 10:26:23.399544 prospector.go:183: DBG Run prospector
2019/12/13 10:26:23.399579 prospector_log.go:70: DBG Start next scan
2019/12/13 10:26:23.400226 prospector_log.go:226: DBG Check file for harvesting: /var/log/containers/nvidia-device-plugin-bc5x8_kube-system_nvidia-device-plugin-94ad7e80e7f62375df450bfb950f1bc2845dd63851af8e87dee32fc68229b3f6.log
2019/12/13 10:26:23.400256 prospector_log.go:259: DBG Update existing file for harvesting: /var/log/containers/nvidia-device-plugin-bc5x8_kube-system_nvidia-device-plugin-94ad7e80e7f62375df450bfb950f1bc2845dd63851af8e87dee32fc68229b3f6.log, offset: 140
2019/12/13 10:26:23.400263 prospector_log.go:313: DBG File didn't change: /var/log/containers/nvidia-device-plugin-bc5x8_kube-system_nvidia-device-plugin-94ad7e80e7f62375df450bfb950f1bc2845dd63851af8e87dee32fc68229b3f6.log
2019/12/13 10:26:23.400271 prospector_log.go:226: DBG Check file for harvesting: /var/log/containers/monitoring-prometheus-nodeexporter-t5bsq_kube-system_nodeexporter-f14734f3c57c2b266eeb058b4e32dbc78fcfaa63ad04e9deb20634aa60e66e84.log
2019/12/13 10:26:23.400279 prospector_log.go:259: DBG Update existing file for harvesting: /var/log/containers/monitoring-prometheus-nodeexporter-t5bsq_kube-system_nodeexporter-f14734f3c57c2b266eeb058b4e32dbc78fcfaa63ad04e9deb20634aa60e66e84.log, offset: 5605
2019/12/13 10:26:23.400284 prospector_log.go:313: DBG File didn't change: /var/log/containers/monitoring-prometheus-nodeexporter-t5bsq_kube-system_nodeexporter-f14734f3c57c2b266eeb058b4e32dbc78fcfaa63ad04e9deb20634aa60e66e84.log
2019/12/13 10:26:23.400291 prospector_log.go:226: DBG Check file for harvesting: /var/log/containers/calico-node-tgxkl_kube-system_install-cni-e65fb542a64c3fe9b3d1de82fbe5e5b79c1cc0cb23e1911156a28855a0f8952f.log
2019/12/13 10:26:23.400297 prospector_log.go:259: DBG Update existing file for harvesting: /var/log/containers/calico-node-tgxkl_kube-system_install-cni-e65fb542a64c3fe9b3d1de82fbe5e5b79c1cc0cb23e1911156a28855a0f8952f.log, offset: 3708
2019/12/13 10:26:23.400302 prospector_log.go:313: DBG File didn't change: /var/log/containers/calico-node-tgxkl_kube-system_install-cni-e65fb542a64c3fe9b3d1de82fbe5e5b79c1cc0cb23e1911156a28855a0f8952f.log
2019/12/13 10:26:23.400308 prospector_log.go:226: DBG Check file for harvesting: /var/log/containers/logging-elk-filebeat-ds-9c6bd_kube-system_filebeat-8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798.log
2019/12/13 10:26:23.400320 prospector_log.go:245: DBG Start harvester for new file: /var/log/containers/logging-elk-filebeat-ds-9c6bd_kube-system_filebeat-8e76c31819d6bc2d43e72cc81387294a1b566b3518c6694d542db9e0ed5f2798.log
2019/12/13 10:26:26.650513 sync.go:78: DBG 120 events out of 2048 events sent to logstash. Continue sending
2019/12/13 10:26:26.650545 sync.go:107: DBG Try to publish 1928 events to logstash with window size 120
2019/12/13 10:26:31.125702 sync.go:78: DBG 120 events out of 1928 events sent to logstash. Continue sending
2019/12/13 10:26:31.125744 sync.go:107: DBG Try to publish 1808 events to logstash with window size 120
2019/12/13 10:26:36.691001 sync.go:78: DBG 120 events out of 1808 events sent to logstash. Continue sending
2019/12/13 10:26:36.691041 sync.go:107: DBG Try to publish 1688 events to logstash with window size 120
2019/12/13 10:26:48.675369 sync.go:78: DBG 120 events out of 1688 events sent to logstash. Continue sending
2019/12/13 10:26:48.675407 sync.go:107: DBG Try to publish 1568 events to logstash with window size 120
2019/12/13 10:26:49.056948 metrics.go:39: INFO Non-zero metrics in the last 30s: libbeat.logstash.publish.read_bytes=140 libbeat.logstash.publish.write_bytes=33934
2019/12/13 10:26:55.722936 sync.go:78: DBG 120 events out of 1568 events sent to logstash. Continue sending
2019/12/13 10:26:55.722971 sync.go:107: DBG Try to publish 1448 events to logstash with window size 120
2019/12/13 10:27:00.115661 sync.go:78: DBG 120 events out of 1448 events sent to logstash. Continue sending
2019/12/13 10:27:00.115704 sync.go:107: DBG Try to publish 1328 events to logstash with window size 120
2019/12/13 10:27:13.896299 sync.go:78: DBG 120 events out of 1328 events sent to logstash. Continue sending
2019/12/13 10:27:13.896345 sync.go:107: DBG Try to publish 1208 events to logstash with window size 120
2019/12/13 10:27:19.056918 metrics.go:39: INFO Non-zero metrics in the last 30s: libbeat.logstash.publish.read_bytes=105 libbeat.logstash.publish.write_bytes=24952
2019/12/13 10:27:24.090525 sync.go:78: DBG 120 events out of 1208 events sent to logstash. Continue sending
2019/12/13 10:27:24.090558 sync.go:107: DBG Try to publish 1088 events to logstash with window size 120
2019/12/13 10:27:30.011920 sync.go:78: DBG 120 events out of 1088 events sent to logstash. Continue sending
2019/12/13 10:27:30.011959 sync.go:107: DBG Try to publish 968 events to logstash with window size 120
2019/12/13 10:27:39.834750 sync.go:78: DBG 120 events out of 968 events sent to logstash. Continue sending
2019/12/13 10:27:39.834781 sync.go:107: DBG Try to publish 848 events to logstash with window size 120
2019/12/13 10:27:49.060311 metrics.go:39: INFO Non-zero metrics in the last 30s: libbeat.logstash.publish.read_bytes=105 libbeat.logstash.publish.write_bytes=24718
^C2019/12/13 10:27:54.838827 client.go:194: DBG handle error: read tcp 10.1.94.36:42626->10.0.59.193:5044: i/o timeout
2019/12/13 10:27:54.838872 sync.go:78: DBG 0 events out of 848 events sent to logstash. Continue sending
2019/12/13 10:27:54.838882 sync.go:58: DBG close connection
2019/12/13 10:27:54.838889 client.go:110: DBG closing
2019/12/13 10:27:54.839175 sync.go:85: ERR Failed to publish events caused by: read tcp 10.1.94.36:42626->10.0.59.193:5044: i/o timeout
2019/12/13 10:27:54.839203 single.go:91: INFO Error publishing events (retrying): read tcp 10.1.94.36:42626->10.0.59.193:5044: i/o timeout
2019/12/13 10:27:54.839211 sync.go:58: DBG close connection
2019/12/13 10:27:54.839230 single.go:156: DBG send fail
2019/12/13 10:27:54.839237 single.go:160: DBG reset fails
2019/12/13 10:27:55.839410 sync.go:53: DBG connect
2019/12/13 10:27:55.859760 sync.go:107: DBG Try to publish 848 events to logstash with window size 60
2019/12/13 10:27:56.677498 window.go:51: DBG update current window size: 60
2019/12/13 10:27:56.677542 sync.go:78: DBG 60 events out of 848 events sent to logstash. Continue sending
2019/12/13 10:27:56.677556 sync.go:107: DBG Try to publish 788 events to logstash with window size 90
```