I have a cluster running in a lab that I use to test out things before we do them in production. This cluster started out on version 7.14.x roughly. Fleet and agents have been installed and upgraded several times, through 7.16, 7.17, 8.0, 8.1 and recently to 8.2. I believe the cluster was on 8.1.2 and instead of going to 8.1.3, I just decided to upgrade to 8.2.
Upgrading the Elasticsearch nodes seemed to go just fine, but Kibana would not start after the upgrade. All the Elastic Security rules and all the agent policies were missing. I didn't have /var/log/messages logging at that time, so I think it was in journalctl that I found 2 errors. I corrected them, and Kibana was able to start after that. However, the rules and policies were, and still are, missing.
I can't find the errors now, but one of them was fixed with this command:
curl -k -u <user>:<password> -X PUT 'https://localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{"transient": {"cluster.routing.allocation.enable": null}, "persistent": {"cluster.routing.allocation.enable": null}}'
Can anyone provide some context on what might have happened, and on whether the missing data can still be retrieved or recovered?
What I have done so far: I created new policies (fleet server, default 1, and windows server 1), and I redeployed the Fleet Server integration and 2 agents. As there was no way to upgrade or unenroll the old agents, I found that they could be forcibly removed by using this API:
curl -k -u elastic:homelab --request POST --url https://localhost:5601/api/fleet/agents/<agent_id>/unenroll --header 'content-type: application/json' --header 'kbn-xsrf: xx' --data-raw '{"force":true,"revoke":true}' --compressed
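For anyone in the same spot with many orphaned agents, here is a rough sketch of how the call above could be looped over a list of agent IDs. The function names, the KIBANA_URL, KIBANA_AUTH and DRY_RUN variables, and the dry-run behaviour are my own illustration and not part of any Elastic tooling; only the endpoint, headers and request body come from the command above. It defaults to a dry run that just prints what it would call.

```shell
#!/usr/bin/env bash
# Sketch only: force-unenroll each agent ID passed on the command line.
# KIBANA_URL, KIBANA_AUTH and DRY_RUN are hypothetical variables for this example.
KIBANA_URL="${KIBANA_URL:-https://localhost:5601}"
KIBANA_AUTH="${KIBANA_AUTH:-elastic:homelab}"   # lab credentials from the command above
DRY_RUN="${DRY_RUN:-1}"                         # 1 = only print the request, 0 = send it

# Build the unenroll URL for a single agent ID.
unenroll_url() {
  printf '%s/api/fleet/agents/%s/unenroll' "$KIBANA_URL" "$1"
}

# Send (or, in dry-run mode, just print) the force-unenroll request for one agent.
force_unenroll() {
  local url
  url="$(unenroll_url "$1")"
  if [ "$DRY_RUN" = "1" ]; then
    echo "POST $url"
    return 0
  fi
  # -k skips TLS verification, which is only reasonable in a lab.
  curl -k -u "$KIBANA_AUTH" --request POST --url "$url" \
    --header 'content-type: application/json' \
    --header 'kbn-xsrf: xx' \
    --data-raw '{"force":true,"revoke":true}'
}

# Process every agent ID given as an argument.
for id in "$@"; do
  force_unenroll "$id"
done
```

Saved as, say, unenroll.sh, it could be run as `DRY_RUN=0 ./unenroll.sh <agent_id> <agent_id>` once the dry-run output looks right.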
So I could continue to remove the agents and recreate the policies. However, it would be good to find out what happened and why. If anyone has any suggestions on things to check or information to retrieve to answer that question, I'd be interested to know.
Edit: I added the screenshot. The 3 policies shown are the ones I recreated, and I reinstalled their respective agents. The rest of the list, showing agents without a policy, are the old agents. I have also now noticed that the 2 test spaces I created months ago are missing.
Thanks!
Robert