I recently upgraded my Elastic Cloud instance to 8.6.1. After upgrading, I triggered an agent upgrade to v8.6.1 in Fleet. The agents did not upgrade and were stuck in Updating status for almost a week. If I log in to the server and manually trigger an upgrade, the upgrade occurs on the server and Fleet shows the new version, but the agent status in Fleet still shows as Updating. The command I use to do the upgrade in my Linux environment is:
I found another post from December that reported similar behavior, but it was unclear to me from the answers there whether following those instructions would remove the Updating status from my agents, update Fleet instead (which is already updated), or unenroll my agents.
So are the agents actually upgraded and just stuck in the Updating state? It would help to check the agent logs to see why they haven't gone back to Healthy.
I think the linked commands should work to move the agents back to Healthy manually. I don't see why the agents would be unenrolled.
No. The agents never updated. I can get them to update by going in and manually updating from the host (and the upgraded version shows in Fleet), but they still show as Updating in Fleet.
It would be best to find out why the Upgrade action from Kibana doesn't work. You can gather the agent logs with the elastic-agent diagnostics collect command.
Same issue here on Elastic Stack 8.6.2, upgraded from 8.6.1 to 8.6.2.
We have two Fleet Servers.
One went fine. The other one is stuck on Updating, along with several other agents.
Log of the Fleet Server that is not able to upgrade and is stuck in the Updating state. I tried restarting the Elastic Agent service, but it is still stuck:
14:11:13.647[elastic_agent][info] signal "terminated" received
14:11:13.647[elastic_agent][info] Shutting down Elastic Agent and sending last events...
14:11:13.647[elastic_agent][info] signal "terminated" received
14:11:13.647[elastic_agent][info] Shutting down Elastic Agent and sending last events...
14:11:13.648[elastic_agent][warn] Possible transient error during checkin with fleet-server, retrying
14:11:13.648[elastic_agent][error] checkin retry loop was stopped
14:11:13.648[elastic_agent][warn] Possible transient error during checkin with fleet-server, retrying
14:11:13.648[elastic_agent][error] checkin retry loop was stopped
14:11:13.748[elastic_agent][info] Shutting down completed.
14:11:13.748[elastic_agent][info] Stats endpoint ([::]:6791) finished: accept tcp [::]:6791: use of closed network connection
14:11:13.748[elastic_agent][info] Shutting down completed.
14:11:13.748[elastic_agent][info] Stats endpoint ([::]:6791) finished: accept tcp [::]:6791: use of closed network connection
14:11:14.047[elastic_agent][info] APM instrumentation disabled
14:11:14.050[elastic_agent][info] Gathered system information
14:11:14.059[elastic_agent][info] Detected available inputs and outputs
14:11:14.059[elastic_agent][info] Capabilities file not found in /opt/Elastic/Agent/capabilities.yml
14:11:14.059[elastic_agent][info] Determined allowed capabilities
14:11:14.185[elastic_agent][info] Parsed configuration and determined agent is managed by Fleet
14:11:14.197[elastic_agent][info] Starting stats endpoint
14:11:14.197[elastic_agent][info] Metrics endpoint listening on: [::]:6791 (configured: http://:6791)
14:11:14.199[elastic_agent][info] Docker provider skipped, unable to connect: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
14:11:14.810[elastic_agent][info] restoring current policy from disk
14:11:14.819[elastic_agent][info] Source URI changed from "https://artifacts.elastic.co/downloads/" to "http://nexus.x.local/repository/proxy-raw-elasticagent/"
14:11:14.822[elastic_agent][info] Updating running component model
14:11:14.887[elastic_agent][info] Spawned new component system/metrics-default: Starting: spawned pid '22133'
14:11:14.887[elastic_agent][info] Spawned new unit system/metrics-default-system/metrics-system-aeee1beb-e6eb-4d9d-895e-8085d06d13c1: Starting: spawned pid '22133'
14:11:14.887[elastic_agent][info] Spawned new unit system/metrics-default: Starting: spawned pid '22133'
14:11:15.042[elastic_agent][info] Spawned new component fleet-server-default: Starting: spawned pid '22170'
14:11:15.042[elastic_agent][info] Spawned new unit fleet-server-default-fleet-server-fleet_server-aeddffe9-0d25-4571-8242-9f059f74ce0e: Starting: spawned pid '22170'
14:11:15.042[elastic_agent][info] Spawned new unit fleet-server-default: Starting: spawned pid '22170'
14:11:15.238[elastic_agent][info] Spawned new component log-default: Starting: spawned pid '22184'
14:11:15.238[elastic_agent][info] Spawned new unit log-default-logfile-system-aeee1beb-e6eb-4d9d-895e-8085d06d13c1: Starting: spawned pid '22184'
14:11:15.238[elastic_agent][info] Spawned new unit log-default: Starting: spawned pid '22184'
14:11:15.488[elastic_agent][info] Spawned new component filestream-monitoring: Starting: spawned pid '22225'
14:11:15.488[elastic_agent][info] Spawned new unit filestream-monitoring-filestream-monitoring-agent: Starting: spawned pid '22225'
14:11:15.488[elastic_agent][info] Spawned new unit filestream-monitoring: Starting: spawned pid '22225'
14:11:15.668[elastic_agent][info] Spawned new component beat/metrics-monitoring: Starting: spawned pid '22262'
14:11:15.668[elastic_agent][info] Spawned new unit beat/metrics-monitoring-metrics-monitoring-beats: Starting: spawned pid '22262'
14:11:15.668[elastic_agent][info] Spawned new unit beat/metrics-monitoring: Starting: spawned pid '22262'
14:11:15.921[elastic_agent][info] Spawned new component http/metrics-monitoring: Starting: spawned pid '22293'
14:11:15.922[elastic_agent][info] Spawned new unit http/metrics-monitoring: Starting: spawned pid '22293'
14:11:15.922[elastic_agent][info] Spawned new unit http/metrics-monitoring-metrics-monitoring-agent: Starting: spawned pid '22293'
14:11:15.922[elastic_agent][error] lazy acker: failed ack batch, enqueue for retry: [action_id: policy:fleet-server-policy:14:1, type: POLICY_CHANGE]
14:11:15.925[elastic_agent][info] Component state changed filestream-monitoring (STARTING->HEALTHY): Healthy: communicating with pid '22225'
14:11:15.925[elastic_agent][info] Component state changed fleet-server-default (STARTING->HEALTHY): Healthy: communicating with pid '22170'
14:11:15.925[elastic_agent][info] Component state changed system/metrics-default (STARTING->HEALTHY): Healthy: communicating with pid '22133'
14:11:15.925[elastic_agent][info] Component state changed log-default (STARTING->HEALTHY): Healthy: communicating with pid '22184'
14:11:15.925[elastic_agent][info] Component state changed beat/metrics-monitoring (STARTING->HEALTHY): Healthy: communicating with pid '22262'
14:11:16.029[elastic_agent][info] Unit state changed filestream-monitoring-filestream-monitoring-agent (STARTING->HEALTHY): Healthy
14:11:16.029[elastic_agent][info] Unit state changed filestream-monitoring (STARTING->HEALTHY): Healthy
14:11:16.031[elastic_agent][info] Unit state changed log-default-logfile-system-aeee1beb-e6eb-4d9d-895e-8085d06d13c1 (STARTING->HEALTHY): Healthy
14:11:16.031[elastic_agent][info] Unit state changed log-default (STARTING->HEALTHY): Healthy
14:11:16.041[elastic_agent][info] Unit state changed beat/metrics-monitoring-metrics-monitoring-beats (STARTING->HEALTHY): Healthy
14:11:16.041[elastic_agent][info] Unit state changed beat/metrics-monitoring (STARTING->HEALTHY): Healthy
14:11:16.054[elastic_agent][info] Unit state changed system/metrics-default-system/metrics-system-aeee1beb-e6eb-4d9d-895e-8085d06d13c1 (STARTING->HEALTHY): Healthy
14:11:16.054[elastic_agent][info] Unit state changed system/metrics-default (STARTING->HEALTHY): Healthy
14:11:16.062[elastic_agent][info] Unit state changed fleet-server-default-fleet-server-fleet_server-aeddffe9-0d25-4571-8242-9f059f74ce0e (STARTING->CONFIGURING): Re-configuring
14:11:16.063[elastic_agent][info] Unit state changed fleet-server-default (STARTING->CONFIGURING): Re-configuring
14:11:16.141[elastic_agent][info] Component state changed http/metrics-monitoring (STARTING->HEALTHY): Healthy: communicating with pid '22293'
14:11:16.232[elastic_agent][info] Unit state changed fleet-server-default-fleet-server-fleet_server-aeddffe9-0d25-4571-8242-9f059f74ce0e (CONFIGURING->HEALTHY): Running on policy with Fleet Server integration: fleet-server-policy
14:11:16.232[elastic_agent][info] Unit state changed fleet-server-default (CONFIGURING->HEALTHY): Running on policy with Fleet Server integration: fleet-server-policy
14:11:16.232[elastic_agent][info] Fleet gateway started
14:11:16.251[elastic_agent][info] Unit state changed http/metrics-monitoring-metrics-monitoring-agent (STARTING->HEALTHY): Healthy
14:11:16.251[elastic_agent][info] Unit state changed http/metrics-monitoring (STARTING->HEALTHY): Healthy
14:20:02.046[elastic_agent][info] signal "terminated" received
14:20:02.046[elastic_agent][info] Shutting down Elastic Agent and sending last events...
14:20:02.046[elastic_agent][warn] Possible transient error during checkin with fleet-server, retrying
14:20:02.046[elastic_agent][error] checkin retry loop was stopped
14:20:02.146[elastic_agent][info] Shutting down completed.
14:20:17.587[elastic_agent][info] APM instrumentation disabled
14:20:17.589[elastic_agent][info] Gathered system information
14:20:17.597[elastic_agent][info] Detected available inputs and outputs
14:20:17.597[elastic_agent][info] Capabilities file not found in /opt/Elastic/Agent/capabilities.yml
14:20:17.597[elastic_agent][info] Determined allowed capabilities
14:20:17.655[elastic_agent][info] Parsed configuration and determined agent is managed by Fleet
14:20:17.668[elastic_agent][info] Starting stats endpoint
14:20:17.668[elastic_agent][info] Metrics endpoint listening on: [::]:6791 (configured: http://:6791)
14:20:17.669[elastic_agent][info] Docker provider skipped, unable to connect: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
14:20:18.280[elastic_agent][info] restoring current policy from disk
14:20:18.290[elastic_agent][info] Source URI changed from "https://artifacts.elastic.co/downloads/" to "http://nexus.x.local/repository/proxy-raw-elasticagent/"
14:20:18.292[elastic_agent][info] Updating running component model
14:20:18.411[elastic_agent][info] Spawned new component fleet-server-default: Starting: spawned pid '22624'
14:20:18.411[elastic_agent][info] Spawned new unit fleet-server-default-fleet-server-fleet_server-aeddffe9-0d25-4571-8242-9f059f74ce0e: Starting: spawned pid '22624'
14:20:18.411[elastic_agent][info] Spawned new unit fleet-server-default: Starting: spawned pid '22624'
14:20:18.510[elastic_agent][info] Spawned new component log-default: Starting: spawned pid '22639'
14:20:18.510[elastic_agent][info] Spawned new unit log-default-logfile-system-aeee1beb-e6eb-4d9d-895e-8085d06d13c1: Starting: spawned pid '22639'
14:20:18.510[elastic_agent][info] Spawned new unit log-default: Starting: spawned pid '22639'
14:20:18.661[elastic_agent][info] Spawned new component system/metrics-default: Starting: spawned pid '22678'
14:20:18.661[elastic_agent][info] Spawned new unit system/metrics-default-system/metrics-system-aeee1beb-e6eb-4d9d-895e-8085d06d13c1: Starting: spawned pid '22678'
14:20:18.661[elastic_agent][info] Spawned new unit system/metrics-default: Starting: spawned pid '22678'
14:20:18.705[elastic_agent][info] Spawned new component filestream-monitoring: Starting: spawned pid '22685'
14:20:18.705[elastic_agent][info] Spawned new unit filestream-monitoring: Starting: spawned pid '22685'
14:20:18.705[elastic_agent][info] Spawned new unit filestream-monitoring-filestream-monitoring-agent: Starting: spawned pid '22685'
14:20:18.821[elastic_agent][info] Spawned new component beat/metrics-monitoring: Starting: spawned pid '22743'
14:20:18.821[elastic_agent][info] Spawned new unit beat/metrics-monitoring-metrics-monitoring-beats: Starting: spawned pid '22743'
14:20:18.821[elastic_agent][info] Spawned new unit beat/metrics-monitoring: Starting: spawned pid '22743'
14:20:18.900[elastic_agent][info] Spawned new component http/metrics-monitoring: Starting: spawned pid '22771'
14:20:18.901[elastic_agent][info] Spawned new unit http/metrics-monitoring-metrics-monitoring-agent: Starting: spawned pid '22771'
14:20:18.901[elastic_agent][error] lazy acker: failed ack batch, enqueue for retry: [action_id: policy:fleet-server-policy:14:1, type: POLICY_CHANGE]
14:20:18.901[elastic_agent][info] Spawned new unit http/metrics-monitoring: Starting: spawned pid '22771'
14:20:18.904[elastic_agent][info] Component state changed system/metrics-default (STARTING->HEALTHY): Healthy: communicating with pid '22678'
14:20:18.904[elastic_agent][info] Component state changed log-default (STARTING->HEALTHY): Healthy: communicating with pid '22639'
14:20:18.904[elastic_agent][info] Component state changed fleet-server-default (STARTING->HEALTHY): Healthy: communicating with pid '22624'
14:20:18.904[elastic_agent][info] Component state changed filestream-monitoring (STARTING->HEALTHY): Healthy: communicating with pid '22685'
14:20:18.924[elastic_agent][info] Component state changed beat/metrics-monitoring (STARTING->HEALTHY): Healthy: communicating with pid '22743'
14:20:19.009[elastic_agent][info] Unit state changed filestream-monitoring (STARTING->HEALTHY): Healthy
14:20:19.009[elastic_agent][info] Unit state changed filestream-monitoring-filestream-monitoring-agent (STARTING->HEALTHY): Healthy
14:20:19.011[elastic_agent][info] Unit state changed log-default-logfile-system-aeee1beb-e6eb-4d9d-895e-8085d06d13c1 (STARTING->HEALTHY): Healthy
14:20:19.011[elastic_agent][info] Unit state changed log-default (STARTING->HEALTHY): Healthy
14:20:19.029[elastic_agent][info] Unit state changed system/metrics-default-system/metrics-system-aeee1beb-e6eb-4d9d-895e-8085d06d13c1 (STARTING->HEALTHY): Healthy
14:20:19.029[elastic_agent][info] Unit state changed system/metrics-default (STARTING->HEALTHY): Healthy
14:20:19.037[elastic_agent][info] Unit state changed beat/metrics-monitoring-metrics-monitoring-beats (STARTING->HEALTHY): Healthy
14:20:19.037[elastic_agent][info] Unit state changed beat/metrics-monitoring (STARTING->HEALTHY): Healthy
14:20:19.084[elastic_agent][info] Component state changed http/metrics-monitoring (STARTING->HEALTHY): Healthy: communicating with pid '22771'
14:20:19.195[elastic_agent][info] Unit state changed http/metrics-monitoring (STARTING->HEALTHY): Healthy
14:20:19.195[elastic_agent][info] Unit state changed http/metrics-monitoring-metrics-monitoring-agent (STARTING->HEALTHY): Healthy
14:20:19.212[elastic_agent][info] Unit state changed fleet-server-default-fleet-server-fleet_server-aeddffe9-0d25-4571-8242-9f059f74ce0e (STARTING->HEALTHY): Running on policy with Fleet Server integration: fleet-server-policy
14:20:19.212[elastic_agent][info] Fleet gateway started
14:20:19.212[elastic_agent][info] Unit state changed fleet-server-default (STARTING->HEALTHY): Running on policy with Fleet Server integration: fleet-server-policy
14:20:19.424[elastic_agent][info] Unit state changed fleet-server-default-fleet-server-fleet_server-aeddffe9-0d25-4571-8242-9f059f74ce0e (HEALTHY->CONFIGURING): Re-configuring
14:20:19.424[elastic_agent][info] Unit state changed fleet-server-default (HEALTHY->CONFIGURING): Re-configuring
14:20:24.501[elastic_agent][info] Unit state changed fleet-server-default-fleet-server-fleet_server-aeddffe9-0d25-4571-8242-9f059f74ce0e (CONFIGURING->HEALTHY): Running on policy with Fleet Server integration: fleet-server-policy
14:20:24.501[elastic_agent][info] Unit state changed fleet-server-default (CONFIGURING->HEALTHY): Running on policy with Fleet Server integration: fleet-server-policy
14:20:24.501.fleet_server[elastic_agent.fleet_server][info] Running on policy with Fleet Server integration: fleet-server-policy
14:20:42.056.fleet_server[elastic_agent.fleet_server][info] Starting policy coordinator
14:20:42.056.fleet_server[elastic_agent.fleet_server][info] Starting policy coordinator
14:20:42.056.fleet_server[elastic_agent.fleet_server][info] Starting policy coordinator
14:20:42.056.fleet_server[elastic_agent.fleet_server][info] Starting policy coordinator
14:20:42.056.fleet_server[elastic_agent.fleet_server][info] Starting policy coordinator
14:20:42.056.fleet_server[elastic_agent.fleet_server][info] Starting policy coordinator
14:20:42.056.fleet_server[elastic_agent.fleet_server][info] Starting policy coordinator
14:20:43.210.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:20:53.871.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:20:57.895.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:20:59.813.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:21:01.333.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:21:03.619.fleet_server[elastic_agent.fleet_server][info] Starting policy coordinator
14:21:03.619.fleet_server[elastic_agent.fleet_server][info] Starting policy coordinator
14:21:25.184.fleet_server[elastic_agent.fleet_server][info] Starting policy coordinator
14:21:31.339.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:21:45.325.fleet_server[elastic_agent.fleet_server][info] applying new local metadata
14:21:45.325.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:22:05.765.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:22:22.731.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:22:36.410.fleet_server[elastic_agent.fleet_server][info] ApiKey fail authentication
14:22:36.410.fleet_server[elastic_agent.fleet_server][info] fail checkin
14:22:43.257.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:23:11.845.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:23:22.301.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:23:28.585.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:23:31.428.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:23:32.710.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:23:35.609.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:23:37.576.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:23:44.453.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:23:50.839.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:23:52.683.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:23:52.935.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:24:17.394.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:24:21.969.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:24:37.462.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:24:42.408.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:25:00.682.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:25:13.790.fleet_server[elastic_agent.fleet_server][info] ack event
14:25:15.221.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:25:21.962.fleet_server[elastic_agent.fleet_server][info] ApiKey fail authentication
14:25:21.962.fleet_server[elastic_agent.fleet_server][info] fail checkin
14:25:29.176.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:25:44.693.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:25:47.640.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:25:53.097.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:26:01.446.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:26:05.735.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:26:07.301.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:26:07.301.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:26:08.489.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:26:11.877.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:26:13.222.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:26:13.474.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:26:14.002.fleet_server[elastic_agent.fleet_server][info] applying new local metadata
14:26:14.002.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:26:15.488.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:26:24.018.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:26:25.825.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:26:27.769.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:26:29.176.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:26:42.218.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:27:04.982.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:27:07.200.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:27:15.509.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:27:22.545.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:27:22.545.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:27:23.049.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:27:30.780.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:27:35.096.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:27:41.441.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:27:41.694.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:27:47.886.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:29:58.071.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:30:09.965.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:30:32.283.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:30:34.606.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:30:36.573.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:30:43.724.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:31:07.780.fleet_server[elastic_agent.fleet_server][info] applying new local metadata
14:31:07.780.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:31:39.953.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:31:57.495.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:32:23.061.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:32:35.283.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:32:57.416.fleet_server[elastic_agent.fleet_server][info] applying new components data
14:33:03.684.fleet_server[elastic_agent.fleet_server][info] applying new components data
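A log this long is hard to scan by eye, so here is a minimal Python sketch that filters it down to the lines worth a second look. It assumes the line shape seen in the paste above (a timestamp, an optional component suffix, then [logger][level] and the message); logs on disk or in a diagnostics archive may be formatted differently. The names parse_line, summarize and SUSPICIOUS are made up for this example, and the list of suspicious messages is just what stands out in this particular log (some of them are logged at info level), not an official list.

```python
import re
from collections import Counter

# Matches lines in the shape pasted above, for example:
#   14:11:13.648[elastic_agent][warn] Possible transient error ...
#   14:22:36.410.fleet_server[elastic_agent.fleet_server][info] ApiKey fail authentication
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})"  # HH:MM:SS.mmm
    r"(?:\.[\w.]+)?"                         # optional suffix such as .fleet_server
    r"\[(?P<logger>[^\]]+)\]"                # [elastic_agent] or [elastic_agent.fleet_server]
    r"\[(?P<level>[a-z]+)\]\s*"              # [info], [warn], [error]
    r"(?P<message>.*)$"
)

# Assumption: substrings picked by hand from the log above; several of these
# are emitted at info level, so filtering by level alone would miss them.
SUSPICIOUS = ("fail authentication", "fail checkin", "failed ack batch")


def parse_line(line):
    """Return a dict with time, logger, level and message, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines):
    """Count lines per level and collect warn/error lines plus suspicious messages."""
    levels = Counter()
    flagged = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip anything that is not a log line
        levels[entry["level"]] += 1
        is_suspicious = any(text in entry["message"] for text in SUSPICIOUS)
        if entry["level"] in ("warn", "error") or is_suspicious:
            flagged.append(entry)
    return levels, flagged


# Small demo using lines copied from the log above.
SAMPLE = [
    "14:11:13.648[elastic_agent][warn] Possible transient error during checkin with fleet-server, retrying",
    "14:11:13.648[elastic_agent][error] checkin retry loop was stopped",
    "14:20:43.210.fleet_server[elastic_agent.fleet_server][info] applying new components data",
    "14:22:36.410.fleet_server[elastic_agent.fleet_server][info] ApiKey fail authentication",
]

levels, flagged = summarize(SAMPLE)
print(dict(levels))
for entry in flagged:
    print(entry["time"], entry["level"], entry["message"])
```

To run it against a saved copy of the log, pass the open file to summarize, since it accepts any iterable of lines. On the log above this would surface the repeated "ApiKey fail authentication" and "fail checkin" entries and the "lazy acker: failed ack batch" errors, which seem like the most useful leads for the stuck Updating state.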
Log of an Elastic Agent that tried to upgrade from 8.6.1 to 8.6.2 and is stuck in the Updating state:
06:51:30.562[elastic_agent][info] Component state changed filestream-monitoring (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '1468' exited with code '2'
06:51:30.562[elastic_agent][info] Unit state changed filestream-monitoring-filestream-monitoring-agent (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '1468' exited with code '2'
06:51:30.562[elastic_agent][info] Unit state changed filestream-monitoring (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '1468' exited with code '2'
06:51:31.989[elastic_agent][info] Spawned new component filestream-monitoring: Starting: spawned pid '96664'
06:51:31.989[elastic_agent][info] Spawned new unit filestream-monitoring-filestream-monitoring-agent: Starting: spawned pid '96664'
06:51:31.989[elastic_agent][info] Spawned new unit filestream-monitoring: Starting: spawned pid '96664'
06:51:32.171[elastic_agent][info] Component state changed filestream-monitoring (STARTING->HEALTHY): Healthy: communicating with pid '96664'
06:51:32.280[elastic_agent][info] Unit state changed filestream-monitoring (STARTING->HEALTHY): Healthy
06:51:32.280[elastic_agent][info] Unit state changed filestream-monitoring-filestream-monitoring-agent (STARTING->HEALTHY): Healthy
08:49:46.754[elastic_agent][info] Component state changed filestream-monitoring (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '96664' exited with code '2'
08:49:46.754[elastic_agent][info] Unit state changed filestream-monitoring-filestream-monitoring-agent (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '96664' exited with code '2'
08:49:46.754[elastic_agent][info] Unit state changed filestream-monitoring (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '96664' exited with code '2'
08:49:47.755[elastic_agent][info] Spawned new component filestream-monitoring: Starting: spawned pid '122307'
08:49:47.755[elastic_agent][info] Spawned new unit filestream-monitoring: Starting: spawned pid '122307'
08:49:47.755[elastic_agent][info] Spawned new unit filestream-monitoring-filestream-monitoring-agent: Starting: spawned pid '122307'
08:49:47.948[elastic_agent][info] Component state changed filestream-monitoring (STARTING->HEALTHY): Healthy: communicating with pid '122307'
08:49:48.057[elastic_agent][info] Unit state changed filestream-monitoring (STARTING->HEALTHY): Healthy
08:49:48.058[elastic_agent][info] Unit state changed filestream-monitoring-filestream-monitoring-agent (STARTING->HEALTHY): Healthy
10:29:26.565[elastic_agent][info] Component state changed filestream-monitoring (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '122307' exited with code '2'
10:29:26.565[elastic_agent][info] Unit state changed filestream-monitoring-filestream-monitoring-agent (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '122307' exited with code '2'
10:29:26.565[elastic_agent][info] Unit state changed filestream-monitoring (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '122307' exited with code '2'
10:29:27.566[elastic_agent][info] Spawned new component filestream-monitoring: Starting: spawned pid '143787'
10:29:27.566[elastic_agent][info] Spawned new unit filestream-monitoring-filestream-monitoring-agent: Starting: spawned pid '143787'
10:29:27.566[elastic_agent][info] Spawned new unit filestream-monitoring: Starting: spawned pid '143787'
10:29:27.745[elastic_agent][info] Component state changed filestream-monitoring (STARTING->HEALTHY): Healthy: communicating with pid '143787'
10:29:27.854[elastic_agent][info] Unit state changed filestream-monitoring-filestream-monitoring-agent (STARTING->HEALTHY): Healthy
10:29:27.854[elastic_agent][info] Unit state changed filestream-monitoring (STARTING->HEALTHY): Healthy
11:02:02.369[elastic_agent][info] Component state changed filestream-monitoring (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '143787' exited with code '2'
11:02:02.370[elastic_agent][info] Unit state changed filestream-monitoring-filestream-monitoring-agent (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '143787' exited with code '2'
11:02:02.370[elastic_agent][info] Unit state changed filestream-monitoring (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '143787' exited with code '2'
11:02:03.370[elastic_agent][info] Spawned new component filestream-monitoring: Starting: spawned pid '150802'
11:02:03.370[elastic_agent][info] Spawned new unit filestream-monitoring-filestream-monitoring-agent: Starting: spawned pid '150802'
11:02:03.370[elastic_agent][info] Spawned new unit filestream-monitoring: Starting: spawned pid '150802'
11:02:03.595[elastic_agent][info] Component state changed filestream-monitoring (STARTING->HEALTHY): Healthy: communicating with pid '150802'
11:02:03.703[elastic_agent][info] Unit state changed filestream-monitoring (STARTING->HEALTHY): Healthy
11:02:03.703[elastic_agent][info] Unit state changed filestream-monitoring-filestream-monitoring-agent (STARTING->HEALTHY): Healthy
12:13:26.289[elastic_agent][info] Component state changed filestream-monitoring (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '150802' exited with code '2'
12:13:26.289[elastic_agent][info] Unit state changed filestream-monitoring-filestream-monitoring-agent (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '150802' exited with code '2'
12:13:26.289[elastic_agent][info] Unit state changed filestream-monitoring (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '150802' exited with code '2'
12:13:27.290[elastic_agent][info] Spawned new component filestream-monitoring: Starting: spawned pid '166412'
12:13:27.290[elastic_agent][info] Spawned new unit filestream-monitoring: Starting: spawned pid '166412'
12:13:27.290[elastic_agent][info] Spawned new unit filestream-monitoring-filestream-monitoring-agent: Starting: spawned pid '166412'
12:13:27.382[elastic_agent][info] Component state changed filestream-monitoring (STARTING->HEALTHY): Healthy: communicating with pid '166412'
12:13:27.491[elastic_agent][info] Unit state changed filestream-monitoring-filestream-monitoring-agent (STARTING->HEALTHY): Healthy
12:13:27.491[elastic_agent][info] Unit state changed filestream-monitoring (STARTING->HEALTHY): Healthy
14:11:13.648[elastic_agent][warn] Possible transient error during checkin with fleet-server, retrying
14:17:45.165[elastic_agent][error] Checkin request to fleet-server succeeded after 1 failures
14:20:02.122[elastic_agent][warn] Possible transient error during checkin with fleet-server, retrying
14:25:50.294[elastic_agent][info] signal "terminated" received
14:25:50.294[elastic_agent][info] Shutting down Elastic Agent and sending last events...
14:25:50.294[elastic_agent][error] failed accept conn info connection: accept tcp 127.0.0.1:6788: use of closed network connection
14:25:50.294[elastic_agent][warn] Possible transient error during checkin with fleet-server, retrying
14:25:50.294[elastic_agent][error] checkin retry loop was stopped
14:25:50.395[elastic_agent][info] Shutting down completed.
14:25:50.395[elastic_agent][info] Stats endpoint (127.0.0.1:6791) finished: accept tcp 127.0.0.1:6791: use of closed network connection
14:25:51.209[elastic_agent][info] APM instrumentation disabled
14:25:51.211[elastic_agent][info] Gathered system information
14:25:51.221[elastic_agent][info] Detected available inputs and outputs
14:25:51.221[elastic_agent][info] Capabilities file not found in /opt/Elastic/Agent/capabilities.yml
14:25:51.221[elastic_agent][info] Determined allowed capabilities
14:25:51.259[elastic_agent][info] Parsed configuration and determined agent is managed by Fleet
14:25:51.274[elastic_agent][info] Starting stats endpoint
14:25:51.274[elastic_agent][info] Metrics endpoint listening on: 127.0.0.1:6791 (configured: http://localhost:6791)
14:25:51.274[elastic_agent][info] Docker provider skipped, unable to connect: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
14:25:51.485[elastic_agent][info] restoring current policy from disk
14:25:51.498[elastic_agent][info] Source URI changed from "https://artifacts.elastic.co/downloads/" to "http://nexus.x.local/repository/proxy-raw-elasticagent/"
14:25:51.502[elastic_agent][info] Updating running component model
14:25:51.573[elastic_agent][info] Spawned new component log-default: Starting: spawned pid '195728'
14:25:51.573[elastic_agent][info] Spawned new unit log-default: Starting: spawned pid '195728'
14:25:51.573[elastic_agent][info] Spawned new unit log-default-logfile-system-e7a753ee-cb91-43ad-baa0-dc46eac5f844: Starting: spawned pid '195728'
14:25:51.625[elastic_agent][info] Spawned new component udp-default: Starting: spawned pid '195752'
14:25:51.626[elastic_agent][info] Spawned new unit udp-default-udp-udp-b437a902-3d93-48f0-8024-98482437ec3f: Starting: spawned pid '195752'
14:25:51.626[elastic_agent][info] Spawned new unit udp-default-udp-udp-d463c3e6-d3a9-4c03-82a6-ba63851c21b8: Starting: spawned pid '195752'
14:25:51.626[elastic_agent][info] Spawned new unit udp-default-udp-udp-1fa4754e-1f50-4bc8-a2d8-a88e51c53706: Starting: spawned pid '195752'
14:25:51.626[elastic_agent][info] Spawned new unit udp-default-udp-udp-382e57fd-2a37-4f73-8dbd-7d2038ff2c29: Starting: spawned pid '195752'
14:25:51.626[elastic_agent][info] Spawned new unit udp-default-udp-udp-a06f2d81-5545-4f65-b544-55a553377a30: Starting: spawned pid '195752'
14:25:51.626[elastic_agent][info] Spawned new unit udp-default-udp-udp-1cec846b-577e-43ac-9b8c-9428396655da: Starting: spawned pid '195752'
14:25:51.626[elastic_agent][info] Spawned new unit udp-default-udp-udp-cdd5144f-ac7a-4e53-bfd3-e8ea901c202b: Starting: spawned pid '195752'
14:25:51.626[elastic_agent][info] Spawned new unit udp-default-udp-udp-bc297100-3632-43f1-a9af-3d231639da6d: Starting: spawned pid '195752'
14:25:51.626[elastic_agent][info] Spawned new unit udp-default: Starting: spawned pid '195752'
14:25:51.626[elastic_agent][info] Spawned new unit udp-default-udp-udp-c88effcf-6dd3-43fd-a121-8f0235ecc075: Starting: spawned pid '195752'
14:25:51.626[elastic_agent][info] Spawned new unit udp-default-udp-cisco_asa-880f0456-0308-4a94-88bb-9925cd2feef8: Starting: spawned pid '195752'
14:25:51.626[elastic_agent][info] Spawned new unit udp-default-udp-udp-8462ad5c-15ba-4b09-b1fa-f81d2668e83b: Starting: spawned pid '195752'
14:25:51.626[elastic_agent][info] Spawned new unit udp-default-udp-panw-032ef257-2d77-40c0-8e71-0070be23d5e0: Starting: spawned pid '195752'
14:25:51.626[elastic_agent][info] Spawned new unit udp-default-udp-Cisco ISE-4de3b518-19d6-4dc2-b436-72dbab631c8b: Starting: spawned pid '195752'
14:25:51.741[elastic_agent][info] Spawned new component netflow-default: Starting: spawned pid '195805'
14:25:51.741[elastic_agent][info] Spawned new unit netflow-default-netflow-netflow-0233a326-2634-473f-ab9b-6b6e40036fd9: Starting: spawned pid '195805'
14:25:51.741[elastic_agent][info] Spawned new unit netflow-default: Starting: spawned pid '195805'
14:25:51.785[elastic_agent][info] Spawned new component tcp-default: Starting: spawned pid '195822'
14:25:51.785[elastic_agent][info] Spawned new unit tcp-default-tcp-tcp-26611d2f-e89a-4cdc-81b3-3553c96ccf8d: Starting: spawned pid '195822'
14:25:51.785[elastic_agent][info] Spawned new unit tcp-default: Starting: spawned pid '195822'
14:25:51.864[elastic_agent][info] Spawned new component endpoint-default: Starting: endpoint service runtime
14:25:51.864[elastic_agent][info] Spawned new unit endpoint-default-c983b399-037b-4dcd-b6fc-71d439364660: Starting: endpoint service runtime
14:25:51.864[elastic_agent][info] Spawned new unit endpoint-default: Starting: endpoint service runtime
14:25:52.018[elastic_agent][info] Spawned new component filestream-monitoring: Starting: spawned pid '195876'
14:25:52.018[elastic_agent][info] Spawned new unit filestream-monitoring: Starting: spawned pid '195876'
14:25:52.018[elastic_agent][info] Spawned new unit filestream-monitoring-filestream-monitoring-agent: Starting: spawned pid '195876'
14:25:52.021[elastic_agent][info] Component state changed netflow-default (STARTING->HEALTHY): Healthy: communicating with pid '195805'
14:25:52.021[elastic_agent][info] Component state changed log-default (STARTING->HEALTHY): Healthy: communicating with pid '195728'
14:25:52.021[elastic_agent][info] Component state changed tcp-default (STARTING->HEALTHY): Healthy: communicating with pid '195822'
14:25:52.021[elastic_agent][info] Component state changed udp-default (STARTING->HEALTHY): Healthy: communicating with pid '195752'
14:25:52.028[elastic_agent][error] 2023-02-17 13:25:52: debug: ProcFile.cpp:1038 Found 1 cgroups for pid(195872)
14:25:52.028[elastic_agent][error] 2023-02-17 13:25:52: debug: ProcFile.cpp:1044 cgroup: id=0 type= path=/system.slice/elastic-agent.service
14:25:52.028[elastic_agent][error] 2023-02-17 13:25:52: info: MainPosix.cpp:341 Verifying existing installation
14:25:52.028[elastic_agent][error] 2023-02-17 13:25:52: info: InstallLib.cpp:616 Running [/opt/Elastic/Endpoint/elastic-endpoint] [version --log stdout]
14:25:52.029[elastic_agent][error] 2023-02-17 13:25:52: debug: Exec.cpp:187 ChildMonitor is pid 195880 and monitoring pids 195872 and 195878
14:25:52.125[elastic_agent][info] Unit state changed tcp-default (STARTING->HEALTHY): Healthy
14:25:52.125[elastic_agent][info] Unit state changed tcp-default-tcp-tcp-26611d2f-e89a-4cdc-81b3-3553c96ccf8d (STARTING->HEALTHY): Healthy
14:25:52.126[elastic_agent][info] Unit state changed netflow-default (STARTING->HEALTHY): Healthy
14:25:52.126[elastic_agent][info] Unit state changed netflow-default-netflow-netflow-0233a326-2634-473f-ab9b-6b6e40036fd9 (STARTING->HEALTHY): Healthy
14:25:52.128[elastic_agent][info] Unit state changed log-default-logfile-system-e7a753ee-cb91-43ad-baa0-dc46eac5f844 (STARTING->HEALTHY): Healthy
14:25:52.128[elastic_agent][info] Unit state changed log-default (STARTING->HEALTHY): Healthy
14:25:52.152[elastic_agent][info] Unit state changed udp-default-udp-udp-d463c3e6-d3a9-4c03-82a6-ba63851c21b8 (STARTING->HEALTHY): Healthy
14:25:52.152[elastic_agent][info] Unit state changed udp-default-udp-udp-8462ad5c-15ba-4b09-b1fa-f81d2668e83b (STARTING->HEALTHY): Healthy
14:25:52.152[elastic_agent][info] Unit state changed udp-default-udp-udp-a06f2d81-5545-4f65-b544-55a553377a30 (STARTING->HEALTHY): Healthy
14:25:52.152[elastic_agent][info] Unit state changed udp-default-udp-udp-382e57fd-2a37-4f73-8dbd-7d2038ff2c29 (STARTING->HEALTHY): Healthy
14:25:52.152[elastic_agent][info] Unit state changed udp-default (STARTING->HEALTHY): Healthy
14:25:52.152[elastic_agent][info] Unit state changed udp-default-udp-udp-c88effcf-6dd3-43fd-a121-8f0235ecc075 (STARTING->HEALTHY): Healthy
14:25:52.152[elastic_agent][info] Unit state changed udp-default-udp-udp-1fa4754e-1f50-4bc8-a2d8-a88e51c53706 (STARTING->HEALTHY): Healthy
14:25:52.152[elastic_agent][info] Unit state changed udp-default-udp-udp-bc297100-3632-43f1-a9af-3d231639da6d (STARTING->HEALTHY): Healthy
14:25:52.152[elastic_agent][info] Unit state changed udp-default-udp-Cisco ISE-4de3b518-19d6-4dc2-b436-72dbab631c8b (STARTING->HEALTHY): Healthy
14:25:52.152[elastic_agent][info] Unit state changed udp-default-udp-cisco_asa-880f0456-0308-4a94-88bb-9925cd2feef8 (STARTING->HEALTHY): Healthy
14:25:52.152[elastic_agent][info] Unit state changed udp-default-udp-udp-1cec846b-577e-43ac-9b8c-9428396655da (STARTING->HEALTHY): Healthy
14:25:52.152[elastic_agent][info] Unit state changed udp-default-udp-udp-cdd5144f-ac7a-4e53-bfd3-e8ea901c202b (STARTING->HEALTHY): Healthy
14:25:52.152[elastic_agent][info] Unit state changed udp-default-udp-udp-b437a902-3d93-48f0-8024-98482437ec3f (STARTING->HEALTHY): Healthy
14:25:52.152[elastic_agent][info] Unit state changed udp-default-udp-panw-032ef257-2d77-40c0-8e71-0070be23d5e0 (STARTING->HEALTHY): Healthy
14:25:52.166[elastic_agent][info] Component state changed filestream-monitoring (STARTING->HEALTHY): Healthy: communicating with pid '195876'
14:25:52.229[elastic_agent][error] 2023-02-17 13:25:52: info: InstallLib.cpp:656 Installed endpoint is expected version (version: 8.6.1, compiled: Thu Jan 19 23:00:00 2023, branch: 8.6, commit: 96e72414928d28f380d6a41b03aff24bfdcb816a)
14:25:52.277[elastic_agent][info] Unit state changed filestream-monitoring (STARTING->HEALTHY): Healthy
14:25:52.277[elastic_agent][info] Unit state changed filestream-monitoring-filestream-monitoring-agent (STARTING->HEALTHY): Healthy
14:25:52.414[elastic_agent][info] Fleet gateway started
14:25:52.419[elastic_agent][info] Updating running component model
14:26:20.920[elastic_agent][info] Component state changed endpoint-default (STARTING->HEALTHY): Healthy: communicating with endpoint service
14:26:20.920[elastic_agent][info] Unit state changed endpoint-default (STARTING->HEALTHY): Applied policy {c983b399-037b-4dcd-b6fc-71d439364660}
14:26:40.921[elastic_agent][info] Unit state changed endpoint-default-c983b399-037b-4dcd-b6fc-71d439364660 (STARTING->HEALTHY): Applied policy {c983b399-037b-4dcd-b6fc-71d439364660}
14:28:52.294[elastic_agent][info] Component state changed filestream-monitoring (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '195876' exited with code '2'
14:28:52.294[elastic_agent][info] Unit state changed filestream-monitoring (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '195876' exited with code '2'
14:28:52.294[elastic_agent][info] Unit state changed filestream-monitoring-filestream-monitoring-agent (HEALTHY->STOPPED): Suppressing FAILED state due to restart for '195876' exited with code '2'
14:28:53.295[elastic_agent][info] Spawned new component filestream-monitoring: Starting: spawned pid '196782'
14:28:53.295[elastic_agent][info] Spawned new unit filestream-monitoring-filestream-monitoring-agent: Starting: spawned pid '196782'
14:28:53.295[elastic_agent][info] Spawned new unit filestream-monitoring: Starting: spawned pid '196782'
14:28:53.472[elastic_agent][info] Component state changed filestream-monitoring (STARTING->HEALTHY): Healthy: communicating with pid '196782'
14:28:53.581[elastic_agent][info] Unit state changed filestream-monitoring-filestream-monitoring-agent (STARTING->HEALTHY): Healthy
14:28:53.581[elastic_agent][info] Unit state changed filestream-monitoring (STARTING->HEALTHY): Healthy
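When reading a dump like the one above, note that several lines tagged [error] are really Endpoint debug/info output passed through by the agent, which makes the real warnings harder to spot. Here is a minimal sketch for filtering them out. It assumes each line has the shape HH:MM:SS.mmm[dataset][level] message, as in the excerpt. The names parse_line, summarize and effective_level are made up for this example and are not part of any Elastic tooling.

```python
import re
from collections import Counter

# Assumed line shape, inferred from the pasted excerpt only:
#   HH:MM:SS.mmm[dataset][level] message
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})"
    r"\[(?P<dataset>[^\]]+)\]"
    r"\[(?P<level>[a-z]+)\]\s*"
    r"(?P<message>.*)$"
)

# Endpoint output forwarded by the agent carries its own timestamp and level,
# e.g. "2023-02-17 13:25:52: debug: ProcFile.cpp:1038 ...".
EMBEDDED_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}: (?P<level>\w+): "
)


def parse_line(line):
    """Return a dict for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    embedded = EMBEDDED_RE.match(entry["message"])
    # Prefer the level embedded in the message when there is one; otherwise
    # fall back to the level in the bracketed prefix.
    entry["effective_level"] = embedded.group("level") if embedded else entry["level"]
    return entry


def summarize(lines):
    """Count lines per effective level and collect the warn/error entries."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    counts = Counter(e["effective_level"] for e in entries)
    notable = [e for e in entries if e["effective_level"] in ("warn", "error")]
    return counts, notable


if __name__ == "__main__":
    sample = [
        "14:11:13.648[elastic_agent][warn] Possible transient error during checkin with fleet-server, retrying",
        "14:17:45.165[elastic_agent][error] Checkin request to fleet-server succeeded after 1 failures",
        "14:25:52.028[elastic_agent][error] 2023-02-17 13:25:52: debug: ProcFile.cpp:1038 Found 1 cgroups for pid(195872)",
        "14:25:52.414[elastic_agent][info] Fleet gateway started",
    ]
    counts, notable = summarize(sample)
    print(dict(counts))
    for entry in notable:
        print(entry["time"], entry["effective_level"], entry["message"])
```

Run against the full log, this should leave mostly the check-in warnings and the shutdown errors, which are probably the lines worth including in a bug report.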
I'm also experiencing errors on all my agents after updating them to 8.6.1:
09:39:03.908[elastic_agent][error] Unit state changed filestream-default-filestream-container-logs-7a716136-d362-42b5-9128-dc63bbaf756b-kubernetes-1bac1134-5619-4bd9-938d-0c680ff58a9a.coredns (STARTING->FAILED): Failed: pid '448' exited with code '-1'
09:39:03.908[elastic_agent][error] Unit state changed filestream-default-filestream-container-logs-7a716136-d362-42b5-9128-dc63bbaf756b-kubernetes-b9006164-da1a-49cb-af67-239269670315.proxy (STARTING->FAILED): Failed: pid '448' exited with code '-1'
09:39:03.908[elastic_agent][error] Unit state changed filestream-default-filestream-container-logs-7a716136-d362-42b5-9128-dc63bbaf756b-kubernetes-01b23588-8a55-4fff-94db-6018d6a2d588.aws-vpc-cni-init (STARTING->FAILED): Failed: pid '448' exited with code '-1'
09:39:03.908[elastic_agent][error] Unit state changed filestream-default-filestream-container-logs-7a716136-d362-42b5-9128-dc63bbaf756b-kubernetes-b9006164-da1a-49cb-af67-239269670315.ingress-controller (STARTING->FAILED): Failed: pid '448' exited with code '-1'
09:39:03.908[elastic_agent][error] Unit state changed filestream-default-filestream-container-logs-7a716136-d362-42b5-9128-dc63bbaf756b-kubernetes-246587fa-86cb-4264-bba0-954e62d34c93.application-controller (STARTING->FAILED): Failed: pid '448' exited with code '-1'
09:39:03.909[elastic_agent][error] Unit state changed filestream-default-filestream-container-logs-7a716136-d362-42b5-9128-dc63bbaf756b-kubernetes-510fece8-fd6c-4d23-9b94-43fbd3749c97.coredns (STARTING->FAILED): Failed: pid '448' exited with code '-1'
09:39:04.809[elastic_agent.metricbeat][error] Error fetching data for metricset http.json: error making http request: Get "http://unix/stats": dial unix /usr/share/elastic-agent/state/data/tmp/filestream-default.sock: connect: connection refused
09:39:04.809[elastic_agent.metricbeat][error] Error fetching data for metricset http.json: error making http request: Get "http://unix/stats": dial unix /usr/share/elastic-agent/state/data/tmp/filestream-default.sock: connect: connection refused
09:39:05.088[elastic_agent.filebeat][warn] Filebeat is unable to load the ingest pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the ingest pipelines or are using Logstash pipelines, you can ignore this warning.
09:39:05.096[elastic_agent.filebeat][warn] Filebeat is unable to load the ingest pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the ingest pipelines or are using Logstash pipelines, you can ignore this warning.
While it's not a solution, a workaround can be found in this issue on GitHub. It may or may not be related to a second issue.
I have used the following API call, as noted in the first of the two posts, and Elastic Agent has successfully upgraded and returned to a healthy status on the hosts I've tested it on:
I think there may actually be a regression in Fleet. I looked at my Fleet logs today: nearly a week ago I moved several agents from one policy to another, and that activity still shows as in progress, even though the agents appear under the new policy and I can see data being gathered from them.
Thanks for opening the bug report. This does seem to be a regression in recent versions, and we can reproduce it locally. Please check the GitHub issue for the latest status; hopefully we can find a fix soon.