No errors are reported in the log files (elastic-agent, osquerybeat, osquery ...).
I note the following:
osqueryd is running.
Queries execute correctly on the endpoint using osqueryi.
I keep getting "Non-zero metrics" messages from osquerybeat.
Is there any required configuration to use osquery?
How does Osquery Manager read results from the endpoint?
Can you guide me in troubleshooting this problem?
Hi @A_Abdellah. The action undefined message may mean that there's a problem with osquery running on that agent. Here's some info that will help to troubleshoot this:
What Kibana version is this in?
Did you recently upgrade Osquery Manager, or is it a newly added integration?
What's the status of this agent in Fleet? (You can check that in Kibana from Fleet > Agents)
Is there any required configuration to use osquery ?
No, there shouldn't be any additional setup beyond adding the integration to the agent policy.
@aleksmaus may be able to better answer this question: "how does osquery manager read results from endpoint ?"
@A_Abdellah The 'action undefined' error means that the agent is not running osquerybeat.
Most likely you should be able to see in the agent logs that the agent can't find osquerybeat.exe file on the disk.
The full chain of the processes involved in running osquerybeat is the following: the elastic agent starts osquerybeat, osquerybeat starts osqueryd, osqueryd starts osquery-extension.
In some cases on Windows, the osqueryd.exe process keeps running after the agent is stopped and uninstalled, so the osquerybeat install directory stays locked and cannot be cleanly removed during uninstall. The next time the agent tries to install osquerybeat, the install is skipped because the directory is still there.
We are currently looking into improving the agent child processes handling and uninstall/install/upgrade steps.
Could you please confirm if the osquerybeat is missing from the osquerybeat install directory (osquerybeat install is corrupted)? The directory should have 3 binaries: osquerybeat, osqueryd, and osquery-extension.
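A minimal sketch of that directory check, in Python. The three binary names come from the post above; the `.exe` suffix on Windows and the helper name `missing_binaries` are assumptions for illustration, and the install directory path has to be supplied by you since it differs per install.

```python
from pathlib import Path

# Binary names as listed in the post above.
EXPECTED = ("osquerybeat", "osqueryd", "osquery-extension")


def missing_binaries(install_dir, suffix=""):
    """Return the expected binaries that are absent from install_dir.

    suffix is appended to each name; pass ".exe" on Windows (assumed).
    """
    root = Path(install_dir)
    return [
        name + suffix
        for name in EXPECTED
        if not (root / (name + suffix)).is_file()
    ]
```

Calling `missing_binaries(your_install_dir, suffix=".exe")` returns an empty list when all three files are present; any names it returns point to a corrupted install.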
The workaround for this problem at the moment would be to stop the agent, remove the corrupted osquerybeat install directory manually, then start the agent again. The agent should install osquerybeat if osquery_manager is enabled for the agent policy. You should be able to see the install logged in the agent log.
Another thing to try is to uninstall the agent, check that all directories are removed, and reinstall it.
If you can capture any errors in the logs, please let us know.
Hello @aleksmaus, I stopped the agent, cleanly removed the osquerybeat install folder, then waited for osquerybeat to be reinstalled. I can see osquerybeat running on my endpoint in Task Manager, but I still get 'action undefined' for my queries.
@A_Abdellah I just tested with 7.16.3 stack and the agent on Windows 10 Pro and everything worked out of the box.
What version of the stack are you running, and on what platform?
Could you check a couple more things:
Is osquerybeat a child process of elastic-agent (not orphan)?
It should look something like this in Sysinternals "Process Explorer", for example:
It's unclear at this point how it became orphaned. There are two possibilities:
The agent process crashed for some reason.
The agent process was killed by the system.
You can kill the osquerybeat processes tree (and any other orphaned beats processes if they are running) and restart the agent.
The agent should re-create all the beats processes as child processes.
Use process explorer to see what is happening with the agent process tree during restart.
Maybe try to capture the agent logs if this happens again.
Check the system logs to see if there is anything pointing to the agent service process.
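The parent/child check described above can be sketched in Python as a small function over a process snapshot. This is only an illustration: the snapshot format (a dict of pid to `(parent pid, name)`), the function name `find_orphans`, and the process names used as defaults are assumptions, and collecting a real snapshot (for example from Process Explorer or a process-listing tool) is not shown.

```python
def find_orphans(processes, parent_name="elastic-agent",
                 child_names=("osquerybeat",)):
    """Return pids of child_names processes not parented by parent_name.

    processes maps pid -> (ppid, name). A process counts as orphaned when
    its parent pid is missing from the snapshot or belongs to a process
    with a different name.
    """
    orphans = []
    for pid, (ppid, name) in sorted(processes.items()):
        if name not in child_names:
            continue
        parent = processes.get(ppid)
        if parent is None or parent[1] != parent_name:
            orphans.append(pid)
    return orphans


# Hand-written snapshot: osquerybeat's parent (pid 100) no longer exists.
snapshot = {
    200: (100, "osquerybeat"),
    300: (200, "osqueryd"),
}
orphaned_pids = find_orphans(snapshot)
```

Any pid returned here is a candidate for the "kill the process tree and restart the agent" step; on Windows the names would likely need the `.exe` suffix.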