We often get Windows event code 7009 after a reboot, reporting that the Elastic Agent service did not start: A timeout was reached (30000 milliseconds) while waiting for the ElasticAgent service to connect. This prevents the process endpoint-security.exe from running.
When we manually start the service after the host has booted, it shows as online and healthy in Fleet. My question is, when the service fails to start with a 7009 event code, does that mean there is no endpoint security protection on the host until the service is started successfully?
As long as Endpoint security is starting properly and was previously working, it should start up and be running with the same policy that it was running with before the reboot.
We recently added a mitigation to help with this in the advanced policy section for Endpoint, which you could try setting to true (but note that it will not take effect unless Agent is already running to deliver that new configuration to Endpoint).
The Agent team is working on a complete fix for this issue that will hopefully be out soon (ideally in the next minor, but that's not my team so I can't commit them to that).
Thanks Nick, just to clarify: it didn't start properly, hence the 7009 event code indicating the process was not running. Is the host still protected if the endpoint-security.exe process/service fails to start?
You should have two services: Elastic Agent (elastic-agent.exe) and Elastic Endpoint (elastic-endpoint.exe).
What I have generally seen is that the Elastic Agent service will fail to start due to timeout with the 7009 error but Elastic Endpoint will start properly.
In order for the host to be protected, Elastic Endpoint will need to be running. If you’re experiencing a 7009 error for the Elastic Endpoint service, that is something that is unusual and I would like to have someone on my team look into it further.
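If it helps to confirm which of the two services actually came up after a reboot, here is a minimal Python sketch that reads the state from `sc query`. The service name `ElasticEndpoint` is an assumption on my part, as are the helper names; check the real service names on your host before relying on it.

```python
import re
import subprocess
from typing import Optional

# Assumed service name (not confirmed in this thread); verify it on your host.
ENDPOINT_SERVICE = "ElasticEndpoint"


def parse_sc_state(sc_output: str) -> Optional[str]:
    """Return the state word (e.g. "RUNNING") from `sc query` output, or None."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def service_state(name: str) -> Optional[str]:
    """Run `sc query <name>` (Windows only) and return the parsed state."""
    result = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    return parse_sc_state(result.stdout)
```

On a Windows host, `service_state(ENDPOINT_SERVICE)` returning `"RUNNING"` would suggest Endpoint is up; `None` means no state line was found, for example because the service name is wrong.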
Not when elastic-security.exe fails to run, but when only elastic-agent.exe fails to run.
This is why the Orphaned indicator was added. Its role is to tell you that the Elastic Endpoint service is running and protecting the host, but it can't communicate with Elastic Agent and is therefore unreachable from the stack for exception updates, the host isolation toggle, any response actions…
Unfortunately we're dealing with some corner cases where the Orphaned state is falsely reported on a healthy setup, but we'll eventually clean these issues up.
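To summarize the cases discussed in this thread, a rough sketch of the logic in Python follows. The function name and the returned labels are made up for illustration; they are not Fleet status strings.

```python
def protection_status(agent_running: bool, endpoint_running: bool) -> str:
    """Map the state of the two services to the outcome described in this thread.

    The labels are hypothetical, for illustration only.
    """
    if not endpoint_running:
        # Elastic Endpoint must be running for the host to be protected.
        return "unprotected"
    if not agent_running:
        # Endpoint keeps protecting with its last policy but cannot be
        # reached from the stack (the Orphaned case).
        return "protected, orphaned"
    return "protected, managed"
```

The typical 7009 case described above (Agent timed out, Endpoint started) corresponds to `protection_status(False, True)`.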