My Elastic Agent that runs the Fleet policy stopped working a week or two ago because it cannot start the Fleet server.
# elastic-agent status
Status: FAILED
Message: (no message)
Applications:
* filebeat (CONFIGURING)
Updating configuration
* fleet-server (FAILED)
Missed two check-ins
* metricbeat (HEALTHY)
Running
* filebeat_monitoring (CONFIGURING)
Updating configuration
* metricbeat_monitoring (HEALTHY)
Running
The logs in /opt/Elastic/Agent show a lot of connection refused errors.
2022-06-20T14:26:02.419-0700 WARN status/reporter.go:236 Elastic Agent status changed to: 'degraded'
2022-06-20T14:26:02.419-0700 INFO log/reporter.go:40 2022-06-20T14:26:02-07:00 - message: Application: fleet-server--7.17.3[eb389cc3-2383-46ba-996d-70409ed1f68f]: State changed to DEGRADED: Missed last check-in - type: 'STATE' - sub_type: 'RUNNING'
2022-06-20T14:26:02.989-0700 ERROR fleet/fleet_gateway.go:205 Could not communicate with fleet-server Checking API will retry, error: fail to checkin to fleet-server: Post "http://localhost:8220/api/fleet/agents/eb389cc3-2383-46ba-996d-70409ed1f68f/checkin?": dial tcp 127.0.0.1:8220: connect: connection refused
2022-06-20T14:27:02.427-0700 ERROR status/reporter.go:236 Elastic Agent status changed to: 'error'
2022-06-20T14:27:02.428-0700 ERROR log/reporter.go:36 2022-06-20T14:27:02-07:00 - message: Application: fleet-server--7.17.3[eb389cc3-2383-46ba-996d-70409ed1f68f]: State changed to FAILED: Missed two check-ins - type: 'ERROR' - sub_type: 'FAILED'
2022-06-20T14:28:24.746-0700 ERROR fleet/fleet_gateway.go:205 Could not communicate with fleet-server Checking API will retry, error: fail to checkin to fleet-server: Post "http://localhost:8220/api/fleet/agents/eb389cc3-2383-46ba-996d-70409ed1f68f/checkin?": dial tcp 127.0.0.1:8220: connect: connection refused
2022-06-20T14:33:14.745-0700 ERROR fleet/fleet_gateway.go:205 Could not communicate with fleet-server Checking API will retry, error: fail to checkin to fleet-server: Post "http://localhost:8220/api/fleet/agents/eb389cc3-2383-46ba-996d-70409ed1f68f/checkin?": dial tcp 127.0.0.1:8220: connect: connection refused
I eventually dug my way into /opt/Elastic/Agent/data/elastic-agent-1993ee/logs/default and found the fleet-server-json.log.
{"log.level":"info","service.name":"fleet-server","version":"7.17.3","commit":"298a11f","pid":6853,"ppid":6769,"exe":"/opt/Elastic/Agent/data/elastic-agent-1993ee/install/fleet-server-7.17.3-linux-x86_64/fleet-server","args":["--agent-mode","-E","logging.level=info","-E","http.enabled=true","-E","http.host=unix:///opt/Elastic/Agent/data/tmp/default/fleet-server/fleet-server.sock","-E","logging.json=true","-E","logging.ecs=true","-E","logging.files.path=/opt/Elastic/Agent/data/elastic-agent-1993ee/logs/default","-E","logging.files.name=fleet-server-json.log","-E","logging.files.keepfiles=7","-E","logging.files.permission=0640","-E","logging.files.interval=1h","-E","path.data=/opt/Elastic/Agent/data/elastic-agent-1993ee/run/default/fleet-server--7.17.3"],"@timestamp":"2022-06-20T21:24:59.625Z","message":"Boot fleet-server"}
{"log.level":"info","service.name":"fleet-server","@timestamp":"2022-06-20T21:24:59.626Z","message":"starting communication connection back to Elastic Agent"}
{"log.level":"info","service.name":"fleet-server","@timestamp":"2022-06-20T21:24:59.626Z","message":"waiting for Elastic Agent to send initial configuration"}
{"log.level":"error","service.name":"fleet-server","error.message":"only 1 fleet-server input can be defined accessing config","@timestamp":"2022-06-20T21:25:00.148Z","message":"Exiting"}
That last line about 1 fleet-server input isn't about there being more than one instance of Fleet Server running, right? I've only ever had the one instance, and I just reinstalled Elastic Agent and rebooted the server, so there shouldn't be another instance hanging around.
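In case it helps anyone looking at the same file, here is a rough sketch of how the error entries could be pulled out of fleet-server-json.log instead of scrolling through it. It only assumes the log is one JSON object per line with the `log.level`, `error.message`, `message`, and `@timestamp` fields seen above; the `error_entries` helper name is my own and not part of any Elastic tooling.

```python
import json


def error_entries(lines):
    """Return (timestamp, error message, message) for each error-level line.

    Assumes one JSON object per line, as in the fleet-server-json.log excerpt.
    Lines that are blank or not valid JSON are skipped.
    """
    found = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if entry.get("log.level") == "error":
            found.append(
                (
                    entry.get("@timestamp"),
                    entry.get("error.message"),
                    entry.get("message"),
                )
            )
    return found


# Two lines copied from the excerpt above, used here as sample input.
sample = [
    '{"log.level":"info","service.name":"fleet-server","@timestamp":"2022-06-20T21:24:59.626Z","message":"waiting for Elastic Agent to send initial configuration"}',
    '{"log.level":"error","service.name":"fleet-server","error.message":"only 1 fleet-server input can be defined accessing config","@timestamp":"2022-06-20T21:25:00.148Z","message":"Exiting"}',
]

errors = error_entries(sample)
for timestamp, error_message, message in errors:
    print(timestamp, message, error_message)
```

To run it against the real file, the list could be replaced with an open file handle on the log path, since the function only iterates over lines.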
One thought does occur: when I was messing with the policy in Kibana, I got an error when I "upgraded" the fleet policy. It complained that the name already existed. I just stuck a b on the end of the name to rename it, and it then upgraded just fine.
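If the rename left the policy with two Fleet Server integrations, that might be what the "only 1 fleet-server input" message is about. This is only my guess. The sketch below shows how that could be checked once the policy is loaded into a Python dict. It assumes the policy has a top-level `inputs` list whose items carry a `type` field; the sample policy and its `id` values are made up for illustration and are not taken from my system.

```python
def fleet_server_inputs(policy):
    """Return every input in the policy whose type is 'fleet-server'.

    Assumes the policy is a dict with a top-level 'inputs' list.
    """
    return [
        item
        for item in policy.get("inputs", [])
        if item.get("type") == "fleet-server"
    ]


# Hypothetical policy: the second entry stands in for a renamed duplicate.
sample_policy = {
    "inputs": [
        {"id": "fleet-server-1", "type": "fleet-server"},
        {"id": "fleet-server-1b", "type": "fleet-server"},
        {"id": "system-logs", "type": "logfile"},
    ]
}

matches = fleet_server_inputs(sample_policy)
if len(matches) > 1:
    print("More than one fleet-server input:", [m["id"] for m in matches])
```

If more than one input comes back for the real policy, removing the extra integration from the policy in Kibana would be the first thing I'd try.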
Any help would be appreciated.
Thanks!