Elastic fleet server fails to install after 8.6.0

"Fleet Server - Waiting on fleet-server input to be added to default policy","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-01-12T12:57:55.988-0800","log.origin":{"file.name":"cmd/enroll_cmd.go","file.line":792},"message":"Fleet Server - Waiting on active enrollment keys to be created in default policy with Fleet Server integration","ecs.version":"1.6.0"}

I upgraded from 8.5.3 to 8.6.0. The Fleet agent dropped offline, forcing a reinstall. The reinstall with 8.6 fails with the very generic error code 1 along with the message above. The install service logs show it is ignoring the fleet-server-es setting and attempting to connect locally at http://127.0.0.1:9200, which is clearly invalid when I'm providing an ES server on the command line.
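One way to confirm the localhost fallback described here is to scan the service logs for the default address. This is only a rough sketch: `falls_back_to_localhost` is a made-up helper name, and it assumes the logs are piped in on stdin.

```shell
# Hypothetical check: does a log stream on stdin mention the localhost
# Elasticsearch default instead of the host passed on the command line?
# Usage (assumed): <your log command> | falls_back_to_localhost
falls_back_to_localhost() {
  if grep -q 'http://127\.0\.0\.1:9200'; then
    echo "agent is targeting 127.0.0.1:9200"
  else
    echo "no localhost fallback seen"
  fi
}
```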

Trying to reload 8.5.3 now fails with the initial message of "Waiting on fleet-server input to be added to default policy". The policy hasn't changed from when it was working before the upgrade.

Got to love when it worked in dev but fails in production only to find out your account no longer shows you have support lol. The help pages are useless on errors like this.

I recreated the default Fleet Server policy and even made a few manual ones, changing the name for the enrollment steps. Same results. Google hasn't yielded many working fixes, other than showing that the same issue has happened several times before on different versions.

Testing the API key works from the server that is attempting to install Fleet Server.

curl --request POST \
  --url https://servername:5601/api/fleet/setup \
  --header 'authorization: Bearer api_key' \
  --header 'content-type: application/json' \
  --header 'kbn-xsrf: x'
RESPONSE = {"isInitialized":true,"nonFatalErrors":[]}
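If you want to script that check, something like the following could work. It is a sketch under assumptions: `check_fleet_setup` is a hypothetical helper, and it uses plain string matching on compact JSON (no spaces) rather than a real JSON parser, so pretty-printed output would not match.

```shell
# Hypothetical helper: inspect the body returned by POST /api/fleet/setup.
# Assumes compact JSON with no whitespace, as in the response quoted above.
check_fleet_setup() {
  body=$1
  case $body in
    *'"isInitialized":true'*) ;;
    *) echo "not initialized"; return 1 ;;
  esac
  case $body in
    *'"nonFatalErrors":[]'*) echo "ok" ;;
    *) echo "initialized, but non-fatal errors reported"; return 2 ;;
  esac
}

# Example with the response from this thread:
check_fleet_setup '{"isInitialized":true,"nonFatalErrors":[]}'
```

A response that passes this check only shows that Kibana considers Fleet set up. It says nothing about whether the Fleet Server install itself will succeed.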

Agent 8.6.0 has some oddball issues...

Enrolling with the default "--fleet-server-policy=fleet-server-policy" command-line argument that the quick start generates will fail every time. Using the actual policy ID, as the advanced option does, gets you a little further into the install.

In the extracted install files, modifying elastic-search.yml with the API key and changing to the policy ID allows the install to finish. Tested on 4 separate servers, all of which were failing with the autogenerated install options. Each works with the manual options, and all agents are coming back online.

You may still end up getting the following in the journal logs, but that's an annoyance and creates more stuff to filter out:

{"log.level":"warn","@timestamp":"2023-01-12T14:43:02.234-0800","message":"read token request for getting IMDSv2 token returns empty: Put \"http://169.254.169.254/latest/api/token\": context deadline exceeded (Client.Timeout exceeded while awaiting headers). No token in the metadata request will be used.","component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"system/metrics-default","type":"system/metrics"},"log.logger":"add_cloud_metadata","log.origin":{"file.line":81,"file.name":"add_cloud_metadata/provider_aws_ec2.go"},"service.name":"metricbeat","ecs.version":"1.6.0","ecs.version":"1.6.0"}
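For the filtering part, a small filter on the logger name may be enough. This is a sketch, not a tested recipe: `filter_cloud_metadata_noise` is a hypothetical name, and matching on the "log.logger" field is an assumption based on the single line quoted above.

```shell
# Drop add_cloud_metadata warnings from a log stream read on stdin.
# Assumption: the noisy lines all carry "log.logger":"add_cloud_metadata".
# Usage (unit name assumed): journalctl -u elastic-agent | filter_cloud_metadata_noise
filter_cloud_metadata_noise() {
  # grep -v exits non-zero when nothing is left; "|| true" keeps pipelines happy.
  grep -v '"log.logger":"add_cloud_metadata"' || true
}
```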

I have the same problem. When I upgraded from version 8.5.3 to 8.6, my Fleet Server stopped and all my agents went offline. Port 8220 is also down.

I'm having trouble getting it to work

Found a workaround. It only works with the manual zip file installs, not the rpm.

1. Uninstall and remove the existing Fleet agent.
2. Optional - Delete the existing Fleet Server policy in Kibana.
3. Manually create the Fleet Server policy.
4. When adding a Fleet agent, use the advanced options only, unless you want to go and get the policy ID manually. In advanced you will notice the ID is in place and not the policy name.
5. On the server hosting Fleet, use the zip download and open the elasticsearch.yml file in the extracted files. Enter an ES node address where it shows 127.0.0.1. Also enter the API key shown in the advanced flyout. Do not enter the username/password, as the key takes care of the auth part. You can delete the installer folder afterwards with no ill effects.
6. You can use either install string at this point, but make sure to change the fleet policy to match the ID and not the display name. Of course, it's best to use a valid cert.
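To make step 6 concrete, here is a rough sketch that only prints an install command with the policy ID and an explicit ES host filled in. `fleet_install_cmd` and all three argument values are placeholders. The flags are the ones I believe `elastic-agent install` accepts (the flyout normally shows a service token, which this thread calls the API key), so check `elastic-agent install --help` on your version before relying on it.

```shell
# Print (not run) an install command that uses the policy ID and an explicit ES host.
# All three arguments are placeholders supplied by the caller.
fleet_install_cmd() {
  es_url=$1     # an ES node address, not 127.0.0.1
  policy_id=$2  # the policy ID from the advanced flyout, not the display name
  token=$3      # the token/key shown in the advanced flyout
  printf 'sudo ./elastic-agent install --fleet-server-es=%s --fleet-server-service-token=%s --fleet-server-policy=%s\n' \
    "$es_url" "$token" "$policy_id"
}

# Example with placeholder values:
fleet_install_cmd 'https://es-node.example:9200' 'POLICY_ID' 'SERVICE_TOKEN'
```

Printing the command first makes it easy to eyeball that the policy argument is an ID and that the ES URL is not localhost before running anything.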

It looks like the ES string passed to the installer is being ignored, which means I won't be updating the agent on any devices through the GUI in 8.6 and will be waiting. 8.5.x has its own problems, but at least it didn't drop ALL ingested data for hours, causing a blind spot in the network. It's hard to say, though: if you try to reinstall 8.5.3 you end up with an auth issue relating to the API key, and it seems a version check is in place, which is imho another quality control failure.

Hello @PublicName,

Could you please share some more information about your environment:

  • Which distribution you are running;
  • OS and platform used for installing the stack (Linux x86_64, RPM x86_64, RPM aarch64, etc..);

Also, could you please provide the steps you followed so that we can reproduce the issue? And if possible, some logs would be very helpful.

Thanks,
Cristina

@Cristina_Amico

Rocky Linux 8.5
Elasticsearch/Kibana installed from public repo.

yum update - wait for it to finish.
Go into fleet and update fleet server to 8.6 = fail.

The only logs I was able to snag were the snippets above, which repeat endlessly during service startup. Nothing else was created, which was obviously causing frustration. Sorry I'm not much help. I'm afraid this issue also occurred back in 7.13.