Hi @exekias and @kvch,
We also hit this with one of our heartbeat loggers, which eventually produced an alert. It seems that after Filebeat hits this bug, it stops harvesting pod logs entirely.
The heartbeat logger is basically a simple Kubernetes Deployment running:

```sh
while true; do echo "{\"message\": \"Kubernetes checking in\", \"logger_type\": \"stdout\"}"; sleep 60; done
```
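For completeness, a minimal sketch of such a Deployment (the image, names, and labels here are illustrative, not our actual manifest):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: heartbeat-logger        # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: heartbeat-logger
  template:
    metadata:
      labels:
        app: heartbeat-logger
    spec:
      containers:
      - name: heartbeat
        image: busybox          # assumed image; any shell-capable image works
        command: ["/bin/sh", "-c"]
        # Emits one JSON log line per minute so an alert fires if harvesting stops
        args:
        - while true; do echo "{\"message\": \"Kubernetes checking in\", \"logger_type\": \"stdout\"}"; sleep 60; done
```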
I have turned on debug logging as you advised above, but I have to wait for the bug to trigger again before I can share those logs. In the meantime I dug through the Filebeat logs in Kibana and found this excerpt (copied from the Kibana UI, so the lines are in reverse chronological order):
```
May 11th 2018, 11:01:41.358 INFO Prospector ticker stopped
May 11th 2018, 11:01:41.358 INFO Harvester started for file: /var/lib/docker/containers/c098839864276336ab59ffc1b44c1d1e8f4145d86a90b11e17d49a4331342641/c098839864276336ab59ffc1b44c1d1e8f4145d86a90b11e17d49a4331342641-json.log
May 11th 2018, 11:01:41.357 INFO Configured paths: [/var/lib/docker/containers/c098839864276336ab59ffc1b44c1d1e8f4145d86a90b11e17d49a4331342641/*.log]
May 11th 2018, 11:01:41.357 INFO Stopping Prospector: 8982987255257196772
May 11th 2018, 11:01:41.357 INFO Autodiscover starting runner: prospector [type=docker, ID=8755765163011943473]
May 11th 2018, 11:01:41.357 INFO Prospector ticker stopped
May 11th 2018, 11:01:41.357 WARN EXPERIMENTAL: Docker prospector is enabled.
May 11th 2018, 11:01:41.357 INFO Autodiscover stopping runner: prospector [type=docker, ID=8982987255257196772]
May 11th 2018, 11:01:41.356 INFO Configured paths: [/var/lib/docker/containers/109f3ef4093b8e32744f7da5d034e32aa41ab3375b67c83e58ba0aa87b6bef51/*.log]
May 11th 2018, 11:01:41.356 WARN EXPERIMENTAL: Docker prospector is enabled.
May 11th 2018, 11:01:41.356 INFO Autodiscover starting runner: prospector [type=docker, ID=16898863723061103725]
May 11th 2018, 11:01:41.350 INFO Stopping Prospector: 16898863723061103725
May 11th 2018, 11:01:41.350 WARN EXPERIMENTAL: Docker prospector is enabled.
May 11th 2018, 11:01:41.350 INFO Autodiscover stopping runner: prospector [type=docker, ID=16898863723061103725]
May 11th 2018, 11:01:41.350 INFO Prospector ticker stopped
May 11th 2018, 11:01:41.346 ERROR kubernetes: Watching API error proto: wrong wireType = 6 for field ServiceAccountName
May 11th 2018, 11:01:41.346 INFO kubernetes: Ignoring event, moving to most recent resource version
May 11th 2018, 11:01:41.346 INFO kubernetes: Watching API for pod events
May 11th 2018, 11:01:41.344 ERROR kubernetes: Watching API error EOF
May 11th 2018, 11:01:41.344 INFO kubernetes: Watching API for pod events
May 11th 2018, 11:00:09.649 INFO File is inactive: /var/lib/host/logs/auth.log. Closing because close_inactive of 5m0s reached.
May 11th 2018, 10:55:04.630 INFO Harvester started for file: /var/lib/host/logs/auth.log
May 11th 2018, 10:50:09.624 INFO File is inactive: /var/lib/host/logs/auth.log. Closing because close_inactive of 5m0s reached.
May 11th 2018, 10:45:04.610 INFO Harvester started for file: /var/lib/host/logs/auth.log
May 11th 2018, 10:40:09.607 INFO File is inactive: /var/lib/host/logs/auth.log. Closing because close_inactive of 5m0s reached.
May 11th 2018, 10:35:04.586 INFO Harvester started for file: /var/lib/host/logs/auth.log
May 11th 2018, 10:30:09.567 INFO File is inactive: /var/lib/host/logs/auth.log. Closing because close_inactive of 5m0s reached.
May 11th 2018, 10:25:04.557 INFO Harvester started for file: /var/lib/host/logs/auth.log
May 11th 2018, 10:22:14.546 INFO File is inactive: /var/lib/host/logs/auth.log. Closing because close_inactive of 5m0s reached.
May 11th 2018, 10:17:29.095 INFO File is inactive: /var/lib/docker/containers/286317a149ad1220a2dd99ece95807181c32736fac80151c27b49a6a4458e844/286317a149ad1220a2dd99ece95807181c32736fac80151c27b49a6a4458e844-json.log. Closing because close_inactive of 5m0s reached.
May 11th 2018, 10:15:04.536 INFO Harvester started for file: /var/lib/host/logs/auth.log
```
These lines stand out to me the most:

```
May 11th 2018, 11:01:41.346 ERROR kubernetes: Watching API error proto: wrong wireType = 6 for field ServiceAccountName
May 11th 2018, 11:01:41.346 INFO kubernetes: Ignoring event, moving to most recent resource version
May 11th 2018, 11:01:41.346 INFO kubernetes: Watching API for pod events
May 11th 2018, 11:01:41.344 ERROR kubernetes: Watching API error EOF
```
We previously encountered an infinite-loop problem that was fixed in https://github.com/elastic/beats/pull/6504, the same PR that introduced the `Ignoring event, moving to most recent resource version` log line seen above. I hope this helps a bit.