First of all, thank you very much for providing ES Fleet. It's so helpful for our small team to manage and monitor nodes easily.
We have 20 nodes with elastic-agent and all are working fine except one. On NODE1, elastic-agent was running fine with the system integration and 2 custom logs integrations. We added a third custom logs integration, and now the agent is shown as healthy but it doesn't collect any integration data (system metrics, logs, agent logs, agent metrics).
PROBLEM NODE
The agent status shows no integrations, even though in Fleet we have 4 integrations for the policy.
elastic-agent status
Status: HEALTHY
Message: (no message)
Applications: (none)
There are 3 elastic-agent log files. There is no default folder, and no metricbeat or filebeat logs were found.
less /opt/Elastic/Agent/data/elastic-agent-2d80f6/logs/elastic-agent-json.log-20210610185643
{"log.level":"info","@timestamp":"2021-06-10T18:56:43.992+0530","log.origin":{"file.name":"warn/warn.go","file.line":18},"message":"The Elastic Agent is currently in BETA and should not be used in production","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:43.998+0530","log.origin":{"file.name":"application/application.go","file.line":68},"message":"Detecting execution mode","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:43.999+0530","log.origin":{"file.name":"application/application.go","file.line":93},"message":"Agent is managed by Fleet","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:44.000+0530","log.origin":{"file.name":"capabilities/capabilities.go","file.line":59},"message":"capabilities file not found in /opt/Elastic/Agent/capabilities.yml","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:44.402+0530","log.logger":"composable","log.origin":{"file.name":"composable/controller.go","file.line":46},"message":"EXPERIMENTAL - Inputs with variables are currently experimental and should not be used in production","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:44.509+0530","log.logger":"composable.providers.docker","log.origin":{"file.name":"docker/docker.go","file.line":43},"message":"Docker provider skipped, unable to connect: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:44.510+0530","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":62},"message":"Starting stats endpoint","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:44.510+0530","log.origin":{"file.name":"application/managed_mode.go","file.line":291},"message":"Agent is starting","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:44.510+0530","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":64},"message":"Metrics endpoint listening on: /opt/Elastic/Agent/data/tmp/elastic-agent.sock (configured: unix:///opt/Elastic/Agent/data/tmp/elastic-agent.sock)","ecs.version":"1.6.0"}
{"log.level":"warn","@timestamp":"2021-06-10T18:56:44.611+0530","log.origin":{"file.name":"application/managed_mode.go","file.line":304},"message":"failed to ack update open /opt/Elastic/Agent/data/.update-marker: no such file or directory","ecs.version":"1.6.0"}
{"log.level":"warn","@timestamp":"2021-06-10T18:56:45.006+0530","log.logger":"tls","log.origin":{"file.name":"tlscommon/tls_config.go","file.line":98},"message":"SSL/TLS verifications disabled.","ecs.version":"1.6.0"}
less /opt/Elastic/Agent/data/elastic-agent-2d80f6/logs/elastic-agent-json.log-20210610185416
{"log.level":"info","@timestamp":"2021-06-10T18:54:16.640+0530","log.origin":{"file.name":"warn/warn.go","file.line":18},"message":"The Elastic Agent is currently in BETA and should not be used in production","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:16.641+0530","log.origin":{"file.name":"application/application.go","file.line":68},"message":"Detecting execution mode","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:16.642+0530","log.origin":{"file.name":"application/application.go","file.line":93},"message":"Agent is managed by Fleet","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:16.642+0530","log.origin":{"file.name":"capabilities/capabilities.go","file.line":59},"message":"capabilities file not found in /opt/Elastic/Agent/capabilities.yml","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:16.984+0530","log.logger":"composable","log.origin":{"file.name":"composable/controller.go","file.line":46},"message":"EXPERIMENTAL - Inputs with variables are currently experimental and should not be used in production","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:17.087+0530","log.logger":"composable.providers.docker","log.origin":{"file.name":"docker/docker.go","file.line":43},"message":"Docker provider skipped, unable to connect: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:17.088+0530","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":62},"message":"Starting stats endpoint","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:17.088+0530","log.origin":{"file.name":"application/managed_mode.go","file.line":291},"message":"Agent is starting","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:17.088+0530","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":64},"message":"Metrics endpoint listening on: /opt/Elastic/Agent/data/tmp/elastic-agent.sock (configured: unix:///opt/Elastic/Agent/data/tmp/elastic-agent.sock)","ecs.version":"1.6.0"}
{"log.level":"warn","@timestamp":"2021-06-10T18:54:17.189+0530","log.origin":{"file.name":"application/managed_mode.go","file.line":304},"message":"failed to ack update open /opt/Elastic/Agent/data/.update-marker: no such file or directory","ecs.version":"1.6.0"}
{"log.level":"warn","@timestamp":"2021-06-10T18:54:17.791+0530","log.logger":"tls","log.origin":{"file.name":"tlscommon/tls_config.go","file.line":98},"message":"SSL/TLS verifications disabled.","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:43.770+0530","log.origin":{"file.name":"cmd/run.go","file.line":189},"message":"Shutting down Elastic Agent and sending last events...","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:43.770+0530","log.origin":{"file.name":"application/managed_mode.go","file.line":320},"message":"Agent is stopped","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:43.770+0530","log.origin":{"file.name":"cmd/run.go","file.line":197},"message":"Shutting down completed.","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:43.770+0530","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":66},"message":"Stats endpoint (/opt/Elastic/Agent/data/tmp/elastic-agent.sock) finished: accept unix /opt/Elastic/Agent/data/tmp/elastic-agent.sock: use of closed network connection","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2021-06-10T18:56:43.770+0530","log.origin":{"file.name":"fleet/fleet_gateway.go","file.line":205},"message":"Could not communicate with fleet-server Checking API will retry, error: fail to checkin to fleet-server: Post \"https://xx.xx.com:8220/api/fleet/agents/2c6dc841-d1c0-4f3d-b436-7b01e856afc2/checkin?\": context canceled","ecs.version":"1.6.0"}
less /opt/Elastic/Agent/data/elastic-agent-2d80f6/logs/elastic-agent-json.log-20210610185412
{"log.level":"info","@timestamp":"2021-06-10T18:54:12.609+0530","log.origin":{"file.name":"warn/warn.go","file.line":18},"message":"The Elastic Agent is currently in BETA and should not be used in production","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:12.610+0530","log.origin":{"file.name":"application/application.go","file.line":68},"message":"Detecting execution mode","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:12.611+0530","log.origin":{"file.name":"application/application.go","file.line":77},"message":"Agent is managed locally","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:12.614+0530","log.origin":{"file.name":"capabilities/capabilities.go","file.line":59},"message":"capabilities file not found in /opt/Elastic/Agent/capabilities.yml","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:12.934+0530","log.logger":"composable","log.origin":{"file.name":"composable/controller.go","file.line":46},"message":"EXPERIMENTAL - Inputs with variables are currently experimental and should not be used in production","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.040+0530","log.logger":"composable.providers.docker","log.origin":{"file.name":"docker/docker.go","file.line":43},"message":"Docker provider skipped, unable to connect: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.042+0530","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":62},"message":"Starting stats endpoint","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.043+0530","log.origin":{"file.name":"application/local_mode.go","file.line":168},"message":"Agent is starting","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.044+0530","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":64},"message":"Metrics endpoint listening on: /opt/Elastic/Agent/data/tmp/elastic-agent.sock (configured: unix:///opt/Elastic/Agent/data/tmp/elastic-agent.sock)","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.046+0530","log.origin":{"file.name":"application/local_mode.go","file.line":178},"message":"Agent is stopped","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.046+0530","log.origin":{"file.name":"application/periodic.go","file.line":77},"message":"Configuration changes detected","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.054+0530","log.origin":{"file.name":"stateresolver/stateresolver.go","file.line":48},"message":"New State ID is Ej22Ehlz","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.054+0530","log.origin":{"file.name":"stateresolver/stateresolver.go","file.line":49},"message":"Converging state requires execution of 2 step(s)","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.998+0530","log.origin":{"file.name":"operation/operator.go","file.line":191},"message":"waiting for installer of pipeline 'default' to finish","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:16.433+0530","log.origin":{"file.name":"process/app.go","file.line":176},"message":"Signaling application to stop because of shutdown: metricbeat--7.13.1","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2021-06-10T18:54:16.434+0530","log.origin":{"file.name":"log/reporter.go","file.line":36},"message":"2021-06-10T18:54:16+05:30 - message: Application: metricbeat--7.13.1[7fefbe9f-13c3-4d47-96e6-c2b2727ab7a3]: State changed to FAILED: context canceled - type: 'ERROR' - sub_type: 'FAILED'","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2021-06-10T18:54:16.433+0530","log.origin":{"file.name":"status/reporter.go","file.line":236},"message":"Elastic Agent status changed to: 'error'","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:16.436+0530","log.origin":{"file.name":"log/reporter.go","file.line":40},"message":"2021-06-10T18:54:16+05:30 - message: Application: metricbeat--7.13.1[7fefbe9f-13c3-4d47-96e6-c2b2727ab7a3]: State changed to STOPPED: Stopped - type: 'STATE' - sub_type: 'STOPPED'","ecs.version":"1.6.0"}
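To make the warnings and errors in the three log files above easier to spot, here is a minimal sketch of a filter script. It assumes each log line is a single JSON object with the flat "log.level", "@timestamp" and "message" keys seen in the output above. The function name notable_entries is just a hypothetical helper and not part of elastic-agent.

```python
import json
import sys


def notable_entries(lines, levels=("warn", "error")):
    """Yield (timestamp, level, message) for log entries at the given levels.

    Assumes one JSON object per line with flat "log.level", "@timestamp"
    and "message" keys, as in the elastic-agent-json.log files above.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict):
            continue
        if entry.get("log.level") in levels:
            yield entry.get("@timestamp"), entry.get("log.level"), entry.get("message")


if __name__ == "__main__":
    # Usage: python3 filter_agent_logs.py <log file> [<log file> ...]
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as fh:
            for ts, level, msg in notable_entries(fh):
                print(f"{ts} [{level}] {msg}")
```

Run against the files under /opt/Elastic/Agent/data/elastic-agent-2d80f6/logs/, this should print only the warn and error lines, such as the "failed to ack update" and "Could not communicate with fleet-server" entries.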
We tried uninstalling and reinstalling elastic-agent (via the tar file), but the issue persists.
Meanwhile, we tried reinstalling Fleet Server, and now it doesn't seem to run and has vanished from the Agents page. Still, all the agents are working and healthy (except the problematic node).
During fleet-server startup, the error below is shown repeatedly:
Waiting on active enrollment keys to be created in default policy with Fleet Server integration
FLEET SERVER
elastic-agent status
Status: HEALTHY
Message: (no message)
Applications:
* fleet-server (STARTING)
Waiting on active enrollment keys to be created in default policy with Fleet Server integration
Server info
1 - Ubuntu server 18.04
2 - ES stack running version 7.13.1