Those two errors are actually related to Endpoint, not to the problem you're experiencing. But I'm not sure we should be reporting those two lines as errors, so I'm going to look into that.
I'll see if I can find someone who knows more about the orphaned status though to respond.
edit: I responded too quickly. Endpoint logged those lines correctly as info; it was the Agent that reported them as errors.
Thanks for letting us know. We've just recently run into a similar issue on one test setup, where it happened after a stack upgrade.
Was your stack recently upgraded to 9.0?
A little explanation
The Orphaned status comes from an audit document written by an "orphaned" Endpoint. The stack communicates with Elastic Endpoint via Elastic Agent. If the Agent stops working, Endpoint sends an orphaned audit to clearly differentiate this state from a plain Offline state; otherwise such an Agent would simply appear offline.
We will continue to look for the root cause internally.
In the meantime I'd recommend checking the Endpoint service status.
If all appears fine on the Agent and Endpoint service side, then it's only an issue with resetting the audit, which we suspect is the case.
It's not very convenient to fix the state. Do you have many endpoints/agents affected?
You can reset the audit for the affected Agent, but it requires a document update. The agent's audit document in the .fleet-agents index contains the unenrollment reason/time fields that are causing the issue. However, to remove those fields a document update has to be made, since Elasticsearch has no query syntax to delete or alter just one field of a document.
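As a hedged sketch of what to look for, you could inspect the affected agent's document first. The field names `audit_unenrolled_reason` and `audit_unenrolled_time` are assumptions based on recent `.fleet-agents` mappings, and `<agent-id>` is a placeholder; verify both against your own cluster before touching anything:

```console
# Look up the affected agent's document; the document _id is typically
# the agent id shown in Fleet. <agent-id> is a placeholder.
GET .fleet-agents/_doc/<agent-id>

# Or search and project only the suspected fields
# (field names are assumptions, check your own mapping):
GET .fleet-agents/_search
{
  "_source": ["audit_unenrolled_reason", "audit_unenrolled_time", "unenrolled_at", "active"],
  "query": { "ids": { "values": ["<agent-id>"] } }
}
```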
Thank you very much for your answer. Currently we have three affected agents.
I will not touch the .fleet-agents index and will wait patiently for a fixed version; after all, the issue looks purely cosmetic to me, and functionality is not impacted.
Since I found no convenient or supported way to get elastic-agents out of “stuck” or “erroneously displayed” states in kibana→fleet→agents, I did it inconveniently and perhaps unsupportedly this way (thanks @lesio for pointing me in this direction):
Disclaimer: don’t try this on your production ELK.. I guess..
Get yourself some privileges on an internal, hidden system index:
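A minimal sketch of such a role, assuming the security APIs are available to you; the role name `fleet_agents_fixer` is made up, and `allow_restricted_indices` is itself unsupported territory, which is exactly why this whole procedure carries the disclaimer above:

```console
# Create a role that may read/write the hidden .fleet-agents system index.
# allow_restricted_indices: true is what grants access to system indices.
POST /_security/role/fleet_agents_fixer
{
  "indices": [
    {
      "names": [".fleet-agents"],
      "privileges": ["read", "write"],
      "allow_restricted_indices": true
    }
  ]
}
```

Assign the role to your user afterwards (or add the privileges to an existing role) so Dev Tools requests against the index stop being rejected.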
Delete all ancient, antique, old or not-recent documents from the index (e.g. all docs except the last one in the above screenshot..)
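One way to do that step in bulk is a delete-by-query; this is only a hypothetical sketch, and both the `updated_at` field and the `now-30d` cutoff are assumptions you should replace after checking which documents in your screenshot are actually stale:

```console
# Remove stale documents older than a cutoff.
# Verify the matching set with a plain _search using the same query first!
POST .fleet-agents/_delete_by_query
{
  "query": {
    "range": { "updated_at": { "lt": "now-30d" } }
  }
}
```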
Fix the current document - painlessly (but highly discouraged..) until it looks equivalent to the agent’s real state (which you should check locally - we often see already-upgraded agents on local systems that are displayed with a lower version in Kibana, resisting every upgrade attempt via fleet..)
I cleared the “orphaned” string last, after resetting all possibly incorrect date fields (using Painless, as above..)
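The two fix-up steps above can be sketched as a single Painless update. The field names are assumptions (they match what was visible in my document, but confirm them in yours), and `<agent-id>` is a placeholder:

```console
# Clear the suspected audit fields on the affected agent's document.
# remove() deletes a field entirely, which is why a scripted update
# is needed at all - there is no "delete this field" query syntax.
POST .fleet-agents/_update/<agent-id>
{
  "script": {
    "lang": "painless",
    "source": """
      ctx._source.remove('audit_unenrolled_reason');
      ctx._source.remove('audit_unenrolled_time');
      ctx._source.remove('unenrolled_at');
    """
  }
}
```

After the update, give Fleet a moment (or a page refresh) to pick up the changed document before judging whether the displayed state matches reality.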
btw: I hope we get a convenient and supported way to fix this when the future major version comes along:
#! this request accesses system indices: [.fleet-agents-7], but in a future major version, direct access to system indices will be prevented by default