Elastic-agent stopped working after adding new integration

First of all, thank you very much for providing ES Fleet. It's so helpful for our small team to manage and monitor nodes easily.

We have 20 nodes with elastic-agent and all are working fine except one. On NODE1, elastic-agent was running fine with the system integration and 2 custom logs integrations. We added a third custom logs integration, and now the agent is shown as healthy but it doesn't collect any integration data (system metrics, logs, agent logs, agent metrics).

PROBLEM NODE
Agent status shows no integrations, even though in Fleet we have 4 integrations for the policy.

elastic-agent status
Status: HEALTHY
Message: (no message)
Applications: (none)
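
The symptom above (Status is HEALTHY while Applications is empty) is easy to miss when checking many nodes by hand. A minimal sketch of a check for it, assuming the plain-text layout of `elastic-agent status` shown in this thread; the function names and the two sample strings are only illustrative:

```python
# Sample outputs copied from this thread (problem node and Fleet Server).
PROBLEM_NODE = """Status: HEALTHY
Message: (no message)
Applications: (none)
"""

FLEET_SERVER = """Status: HEALTHY
Message: (no message)
Applications:
  * fleet-server\t(STARTING)
    Waiting on active enrollment keys to be created in default policy with Fleet Server integration
"""


def parse_agent_status(text):
    """Return (status, [(application, state), ...]) from the status text."""
    status = None
    applications = []
    in_apps = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Status:"):
            status = line.split(":", 1)[1].strip()
        elif line.startswith("Applications:"):
            # "(none)" on this line means no application entries follow.
            in_apps = True
        elif in_apps and line.startswith("*"):
            parts = line.lstrip("* ").split()
            state = parts[1].strip("()") if len(parts) > 1 else ""
            applications.append((parts[0], state))
    return status, applications


def healthy_but_idle(text):
    """True when the agent reports HEALTHY but runs no applications."""
    status, applications = parse_agent_status(text)
    return status == "HEALTHY" and not applications


print(healthy_but_idle(PROBLEM_NODE))   # True
print(healthy_but_idle(FLEET_SERVER))   # False
```

The check only reads text that was already captured, so it makes no assumption about how the status output is collected from each node.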

There are 3 elastic-agent log files, but no defaults folder and no metricbeat/filebeat logs.

less /opt/Elastic/Agent/data/elastic-agent-2d80f6/logs/elastic-agent-json.log-20210610185643
{"log.level":"info","@timestamp":"2021-06-10T18:56:43.992+0530","log.origin":{"file.name":"warn/warn.go","file.line":18},"message":"The Elastic Agent is currently in BETA and should not be used in production","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:43.998+0530","log.origin":{"file.name":"application/application.go","file.line":68},"message":"Detecting execution mode","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:43.999+0530","log.origin":{"file.name":"application/application.go","file.line":93},"message":"Agent is managed by Fleet","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:44.000+0530","log.origin":{"file.name":"capabilities/capabilities.go","file.line":59},"message":"capabilities file not found in /opt/Elastic/Agent/capabilities.yml","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:44.402+0530","log.logger":"composable","log.origin":{"file.name":"composable/controller.go","file.line":46},"message":"EXPERIMENTAL - Inputs with variables are currently experimental and should not be used in production","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:44.509+0530","log.logger":"composable.providers.docker","log.origin":{"file.name":"docker/docker.go","file.line":43},"message":"Docker provider skipped, unable to connect: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:44.510+0530","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":62},"message":"Starting stats endpoint","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:44.510+0530","log.origin":{"file.name":"application/managed_mode.go","file.line":291},"message":"Agent is starting","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:44.510+0530","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":64},"message":"Metrics endpoint listening on: /opt/Elastic/Agent/data/tmp/elastic-agent.sock (configured: unix:///opt/Elastic/Agent/data/tmp/elastic-agent.sock)","ecs.version":"1.6.0"}
{"log.level":"warn","@timestamp":"2021-06-10T18:56:44.611+0530","log.origin":{"file.name":"application/managed_mode.go","file.line":304},"message":"failed to ack update open /opt/Elastic/Agent/data/.update-marker: no such file or directory","ecs.version":"1.6.0"}
{"log.level":"warn","@timestamp":"2021-06-10T18:56:45.006+0530","log.logger":"tls","log.origin":{"file.name":"tlscommon/tls_config.go","file.line":98},"message":"SSL/TLS verifications disabled.","ecs.version":"1.6.0"}
less /opt/Elastic/Agent/data/elastic-agent-2d80f6/logs/elastic-agent-json.log-20210610185416
{"log.level":"info","@timestamp":"2021-06-10T18:54:16.640+0530","log.origin":{"file.name":"warn/warn.go","file.line":18},"message":"The Elastic Agent is currently in BETA and should not be used in production","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:16.641+0530","log.origin":{"file.name":"application/application.go","file.line":68},"message":"Detecting execution mode","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:16.642+0530","log.origin":{"file.name":"application/application.go","file.line":93},"message":"Agent is managed by Fleet","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:16.642+0530","log.origin":{"file.name":"capabilities/capabilities.go","file.line":59},"message":"capabilities file not found in /opt/Elastic/Agent/capabilities.yml","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:16.984+0530","log.logger":"composable","log.origin":{"file.name":"composable/controller.go","file.line":46},"message":"EXPERIMENTAL - Inputs with variables are currently experimental and should not be used in production","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:17.087+0530","log.logger":"composable.providers.docker","log.origin":{"file.name":"docker/docker.go","file.line":43},"message":"Docker provider skipped, unable to connect: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:17.088+0530","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":62},"message":"Starting stats endpoint","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:17.088+0530","log.origin":{"file.name":"application/managed_mode.go","file.line":291},"message":"Agent is starting","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:17.088+0530","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":64},"message":"Metrics endpoint listening on: /opt/Elastic/Agent/data/tmp/elastic-agent.sock (configured: unix:///opt/Elastic/Agent/data/tmp/elastic-agent.sock)","ecs.version":"1.6.0"}
{"log.level":"warn","@timestamp":"2021-06-10T18:54:17.189+0530","log.origin":{"file.name":"application/managed_mode.go","file.line":304},"message":"failed to ack update open /opt/Elastic/Agent/data/.update-marker: no such file or directory","ecs.version":"1.6.0"}
{"log.level":"warn","@timestamp":"2021-06-10T18:54:17.791+0530","log.logger":"tls","log.origin":{"file.name":"tlscommon/tls_config.go","file.line":98},"message":"SSL/TLS verifications disabled.","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:43.770+0530","log.origin":{"file.name":"cmd/run.go","file.line":189},"message":"Shutting down Elastic Agent and sending last events...","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:43.770+0530","log.origin":{"file.name":"application/managed_mode.go","file.line":320},"message":"Agent is stopped","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:43.770+0530","log.origin":{"file.name":"cmd/run.go","file.line":197},"message":"Shutting down completed.","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:56:43.770+0530","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":66},"message":"Stats endpoint (/opt/Elastic/Agent/data/tmp/elastic-agent.sock) finished: accept unix /opt/Elastic/Agent/data/tmp/elastic-agent.sock: use of closed network connection","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2021-06-10T18:56:43.770+0530","log.origin":{"file.name":"fleet/fleet_gateway.go","file.line":205},"message":"Could not communicate with fleet-server Checking API will retry, error: fail to checkin to fleet-server: Post \"https://xx.xx.com:8220/api/fleet/agents/2c6dc841-d1c0-4f3d-b436-7b01e856afc2/checkin?\": context canceled","ecs.version":"1.6.0"}
less /opt/Elastic/Agent/data/elastic-agent-2d80f6/logs/elastic-agent-json.log-20210610185412
{"log.level":"info","@timestamp":"2021-06-10T18:54:12.609+0530","log.origin":{"file.name":"warn/warn.go","file.line":18},"message":"The Elastic Agent is currently in BETA and should not be used in production","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:12.610+0530","log.origin":{"file.name":"application/application.go","file.line":68},"message":"Detecting execution mode","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:12.611+0530","log.origin":{"file.name":"application/application.go","file.line":77},"message":"Agent is managed locally","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:12.614+0530","log.origin":{"file.name":"capabilities/capabilities.go","file.line":59},"message":"capabilities file not found in /opt/Elastic/Agent/capabilities.yml","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:12.934+0530","log.logger":"composable","log.origin":{"file.name":"composable/controller.go","file.line":46},"message":"EXPERIMENTAL - Inputs with variables are currently experimental and should not be used in production","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.040+0530","log.logger":"composable.providers.docker","log.origin":{"file.name":"docker/docker.go","file.line":43},"message":"Docker provider skipped, unable to connect: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.042+0530","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":62},"message":"Starting stats endpoint","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.043+0530","log.origin":{"file.name":"application/local_mode.go","file.line":168},"message":"Agent is starting","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.044+0530","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":64},"message":"Metrics endpoint listening on: /opt/Elastic/Agent/data/tmp/elastic-agent.sock (configured: unix:///opt/Elastic/Agent/data/tmp/elastic-agent.sock)","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.046+0530","log.origin":{"file.name":"application/local_mode.go","file.line":178},"message":"Agent is stopped","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.046+0530","log.origin":{"file.name":"application/periodic.go","file.line":77},"message":"Configuration changes detected","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.054+0530","log.origin":{"file.name":"stateresolver/stateresolver.go","file.line":48},"message":"New State ID is Ej22Ehlz","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.054+0530","log.origin":{"file.name":"stateresolver/stateresolver.go","file.line":49},"message":"Converging state requires execution of 2 step(s)","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:13.998+0530","log.origin":{"file.name":"operation/operator.go","file.line":191},"message":"waiting for installer of pipeline 'default' to finish","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:16.433+0530","log.origin":{"file.name":"process/app.go","file.line":176},"message":"Signaling application to stop because of shutdown: metricbeat--7.13.1","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2021-06-10T18:54:16.434+0530","log.origin":{"file.name":"log/reporter.go","file.line":36},"message":"2021-06-10T18:54:16+05:30 - message: Application: metricbeat--7.13.1[7fefbe9f-13c3-4d47-96e6-c2b2727ab7a3]: State changed to FAILED: context canceled - type: 'ERROR' - sub_type: 'FAILED'","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2021-06-10T18:54:16.433+0530","log.origin":{"file.name":"status/reporter.go","file.line":236},"message":"Elastic Agent status changed to: 'error'","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2021-06-10T18:54:16.436+0530","log.origin":{"file.name":"log/reporter.go","file.line":40},"message":"2021-06-10T18:54:16+05:30 - message: Application: metricbeat--7.13.1[7fefbe9f-13c3-4d47-96e6-c2b2727ab7a3]: State changed to STOPPED: Stopped - type: 'STATE' - sub_type: 'STOPPED'","ecs.version":"1.6.0"}
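
Since these log files are one JSON object per line, the warn and error entries can be pulled out with a few lines of Python instead of paging through them with less. This is a rough sketch, not an official tool: the function name is made up, and the sample lines are trimmed copies of entries from the logs above (the log.origin field is dropped for brevity).

```python
import json

# Trimmed copies of lines pasted above, including one shell command line.
SAMPLE_LINES = [
    'less /opt/Elastic/Agent/data/elastic-agent-2d80f6/logs/elastic-agent-json.log-20210610185643',
    '{"log.level":"info","@timestamp":"2021-06-10T18:56:43.999+0530","message":"Agent is managed by Fleet","ecs.version":"1.6.0"}',
    '{"log.level":"warn","@timestamp":"2021-06-10T18:56:44.611+0530","message":"failed to ack update open /opt/Elastic/Agent/data/.update-marker: no such file or directory","ecs.version":"1.6.0"}',
    '{"log.level":"warn","@timestamp":"2021-06-10T18:56:45.006+0530","log.logger":"tls","message":"SSL/TLS verifications disabled.","ecs.version":"1.6.0"}',
]


def notable_entries(lines, levels=("warn", "error")):
    """Return (timestamp, level, message) for entries at the given levels."""
    found = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and pasted shell commands
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or otherwise malformed lines
        if entry.get("log.level") in levels:
            found.append(
                (entry.get("@timestamp"), entry["log.level"], entry.get("message", ""))
            )
    return found


for timestamp, level, message in notable_entries(SAMPLE_LINES):
    print(timestamp, level, message)
```

To run it against a real file, pass an open file object in place of SAMPLE_LINES, for example `notable_entries(open(path))`, where path points at one of the elastic-agent-json.log files.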

We tried uninstalling and reinstalling elastic-agent (via the tar file), but the issue remains.

Meanwhile, we tried reinstalling Fleet Server, and now it doesn't seem to run and has vanished from the Agents page. Still, all the agents are working and healthy (except the problematic node).
During fleet-server startup, the error below is shown repeatedly:

Waiting on active enrollment keys to be created in default policy with Fleet Server integration

FLEET SERVER

elastic-agent status
Status: HEALTHY
Message: (no message)
Applications:
  * fleet-server	(STARTING)
    Waiting on active enrollment keys to be created in default policy with Fleet Server integration

Server info
1 - Ubuntu server 18.04
2 - ES stack running version 7.13.1

FIXED.

  1. For the Fleet Server
    The enrollment token was inactive, so we created a new enrollment token, created a new service token, and then added the agent again. Now the Fleet Server agent is shown under Agents and is healthy.

  2. Problematic Node
    Created a new enrollment token and added the agent again with this new token. Now all integrations have started working.

So it seems the issue was with the enrollment tokens.
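
For anyone who wants to check for this condition before re-enrolling, here is a small sketch of the logic: given a list of enrollment tokens, report the policies that have no active one. The record shape (dicts with "policy_id" and "active" fields) and the IDs are assumptions for illustration, loosely modeled on how Fleet lists enrollment tokens; they were not verified against the 7.13.1 API.

```python
# Hypothetical enrollment token records; field names and IDs are assumed.
ENROLLMENT_KEYS = [
    {"id": "key-1", "policy_id": "fleet-server-policy", "active": False},
    {"id": "key-2", "policy_id": "node-policy", "active": True},
    {"id": "key-3", "policy_id": "node-policy", "active": False},
]


def policies_without_active_key(enrollment_keys, policy_ids):
    """Return the policy IDs that have no active enrollment token."""
    active_policies = {
        key["policy_id"] for key in enrollment_keys if key.get("active")
    }
    return [policy for policy in policy_ids if policy not in active_policies]


print(policies_without_active_key(
    ENROLLMENT_KEYS, ["fleet-server-policy", "node-policy"]
))  # ['fleet-server-policy']
```

A policy that shows up in the result would need a new enrollment token created in Fleet before agents can enroll against it, which matches what fixed both cases here.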

Hi @bravo, really glad you figured out the solution. Thanks for posting it here in case others stumble over the same issue.
