Osquery live queries: Windows endpoints not responding, 'action undefined'

Hello,
I installed the Osquery Manager integration on my endpoints. On Linux it worked like magic, but on Windows every live query returns 'action undefined'.

No errors are reported in the log files (elastic-agent, osquerybeat, osquery ...).
I note the following:

  • osqueryd is running.
  • Queries execute correctly on the endpoint using osqueryi.
  • I keep getting 'Non-zero metrics' messages in the osquerybeat log.

Is there any required configuration to use osquery?
How does Osquery Manager read results from the endpoint?
Can you guide me in troubleshooting this problem?

Thank you

Hi @A_Abdellah. The action undefined message may mean that there's a problem with osquery running on that agent. Here's some info that will help to troubleshoot this:

  • What Kibana version is this in?
  • Did you recently upgrade Osquery Manager, or is it a newly added integration?
  • What's the status of this agent in Fleet? (You can check that in Kibana from Fleet > Agents)

"Is there any required configuration to use osquery?"

No, there shouldn't be any additional setup beyond adding the integration to the agent policy.

@aleksmaus may be able to better answer this question: "how does osquery manager read results from endpoint ?"

@A_Abdellah The 'action undefined' error means that the agent is not running osquerybeat.
Most likely you will see in the agent logs that the agent can't find the osquerybeat.exe file on disk.

The full chain of the processes involved in running osquerybeat is the following: the elastic agent starts osquerybeat, osquerybeat starts osqueryd, osqueryd starts osquery-extension.
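
To check that chain on a given endpoint, the parent/child relationships can be compared against the expected order. The following is a minimal Python sketch, not something shipped with the agent. It assumes the process list has already been exported as (pid, parent pid, name) tuples, for example by copying it from Process Explorer. The helper names `normalize`, `parent_name`, and `broken_links` are made up for illustration, and the code also assumes a process may start a same-named copy of itself, so it skips over such copies when looking for the parent.

```python
from typing import Dict, List, Optional, Tuple

# (pid, parent pid, executable name), as exported from a process viewer.
Proc = Tuple[int, int, str]

# Expected parent -> child order described in the post above.
EXPECTED_CHAIN = ["elastic-agent", "osquerybeat", "osqueryd", "osquery-extension"]


def normalize(name: str) -> str:
    """Lower-case the name and drop a trailing '.exe' so names compare equal."""
    name = name.lower()
    return name[:-4] if name.endswith(".exe") else name


def parent_name(pid: int, by_pid: Dict[int, Tuple[int, str]]) -> Optional[str]:
    """Name of the nearest ancestor with a different name, or None if absent."""
    own_name = by_pid[pid][1]
    seen = {pid}  # guards against loops caused by reused pids
    ppid = by_pid[pid][0]
    while ppid in by_pid and ppid not in seen and by_pid[ppid][1] == own_name:
        seen.add(ppid)
        ppid = by_pid[ppid][0]
    if ppid in by_pid and ppid not in seen:
        return by_pid[ppid][1]
    return None  # parent is gone: the process is orphaned


def broken_links(procs: List[Proc]) -> List[str]:
    """Describe every link of EXPECTED_CHAIN that is missing or broken."""
    by_pid = {pid: (ppid, normalize(name)) for pid, ppid, name in procs}
    problems = []
    for parent, child in zip(EXPECTED_CHAIN, EXPECTED_CHAIN[1:]):
        pids = sorted(pid for pid, (_, name) in by_pid.items() if name == child)
        if not pids:
            problems.append(f"{child} is not running")
        for pid in pids:
            if parent_name(pid, by_pid) != parent:
                problems.append(f"{child} (pid {pid}) is not a child of {parent}")
    return problems


if __name__ == "__main__":
    # Hypothetical process table in which the agent process is gone.
    orphaned = [
        (200, 9999, "osquerybeat.exe"),
        (300, 200, "osqueryd.exe"),
        (400, 300, "osquery-extension.exe"),
    ]
    for line in broken_links(orphaned):
        print(line)  # osquerybeat (pid 200) is not a child of elastic-agent
```

An empty result means every link in the chain is present; any line in the output names the process to look at first.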

In some cases on Windows, it looks like the osqueryd.exe process keeps running after the agent is stopped and uninstalled, so the osquerybeat install directory stays locked and cannot be cleanly removed during uninstall. The next time the agent tries to install osquerybeat, the install is skipped because the directory is still there.

We are currently looking into improving the agent child processes handling and uninstall/install/upgrade steps.

Could you please confirm whether osquerybeat is missing from the osquerybeat install directory (that is, whether the osquerybeat install is corrupted)? The directory should have 3 binaries: osquerybeat, osqueryd, and osquery-extension.
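
A quick way to run that check is a small script that lists which of the three binaries are absent. This is only a sketch: the install directory path is not given in this thread, so it must be passed in by you, and the function name `missing_binaries` is made up for illustration. It assumes the binaries sit at the top level of that directory and may or may not carry an `.exe` suffix.

```python
from pathlib import Path
from typing import List

# The three binaries the osquerybeat install directory is expected to hold.
EXPECTED_BINARIES = ("osquerybeat", "osqueryd", "osquery-extension")


def missing_binaries(install_dir: str) -> List[str]:
    """Return the expected binaries that are not present in install_dir."""
    directory = Path(install_dir)
    if not directory.is_dir():
        # No directory at all means nothing is installed.
        return list(EXPECTED_BINARIES)

    present = set()
    for entry in directory.iterdir():
        if not entry.is_file():
            continue
        name = entry.name.lower()
        if name.endswith(".exe"):
            name = name[:-4]  # compare without the Windows suffix
        present.add(name)

    return [binary for binary in EXPECTED_BINARIES if binary not in present]


if __name__ == "__main__":
    import sys

    # Usage: python check_install.py <path to the osquerybeat install directory>
    missing = missing_binaries(sys.argv[1])
    print("missing:", ", ".join(missing) if missing else "nothing")
```

If anything is reported missing, the install is likely in the corrupted state described above.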

The workaround for this problem at the moment would be to stop the agent, remove the corrupted osquerybeat install directory manually, then start the agent again. The agent should install osquerybeat if osquery_manager is enabled for the agent policy. You should be able to see the install logged in the agent log.
Another thing to try is to uninstall the agent, check that all directories are removed, and reinstall it.
If you can capture any errors in the logs, please let us know.

Let us know if this works for you.

I'm using version 7.16.2 of Kibana and Elastic Agent, and Osquery Manager is at version 0.8.0.

Hello @aleksmaus, I stopped the agent, cleanly removed the osquerybeat install folder, then waited for osquerybeat to be reinstalled. I can see osquerybeat running on my endpoint in Task Manager, but I still get 'action undefined' for my queries.

@A_Abdellah I just tested with the 7.16.3 stack and the agent on Windows 10 Pro, and everything worked out of the box.
What version of the stack are you running, and on what platform?

Could you check a couple more things:

  1. Is osquerybeat a child process of elastic-agent (not orphaned)?
    You can check this in the Sysinternals "Process Explorer", for example.

  2. Double-check that you are sending the query to the agent where osquery was enabled.

The agent ID shown in the Kibana UI should match the agent ID in the fleet.yml file on the agent.

You can enable debug logging for the agent at the bottom of the agent details page (log tab).

With debug logging on, the agent log should contain debug-level entries when the agent receives the osquery action and dispatches it to osquerybeat, and the osquerybeat log should show the query being executed and the results being sent back.

Could you please verify everything above and tell us where it doesn't behave as expected?

Thank you for the help debugging this issue.

@aleksmaus I'm running version 7.16.2 of the stack on Linux.

I saw that osquerybeat.exe is orphaned. I think this is my lead now. How can I fix that?

It's unclear at this point how it became orphaned. There are two possibilities:

  1. The agent process crashed for some reason.
  2. The agent process was killed by the system.

You can kill the osquerybeat process tree (and any other orphaned beats processes, if they are running) and restart the agent.
The agent should re-create all the beats processes as child processes.
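
If you want to terminate the tree by hand, it helps to know which pids belong to it and in what order to stop them. The sketch below works that out from an exported process list. It is an illustration only: the (pid, parent pid, name) tuples and the function name `kill_order` are assumptions, and nothing here is taken from the agent itself.

```python
from typing import Dict, List, Tuple

# (pid, parent pid, executable name), as exported from a process viewer.
Proc = Tuple[int, int, str]


def kill_order(procs: List[Proc], root_pid: int) -> List[int]:
    """Return root_pid and all of its descendants, children before parents."""
    children: Dict[int, List[int]] = {}
    for pid, ppid, _name in procs:
        if pid != ppid:  # ignore entries that list themselves as parent
            children.setdefault(ppid, []).append(pid)

    order: List[int] = []
    seen = set()
    stack = [root_pid]
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue
        seen.add(pid)
        order.append(pid)  # parent is recorded before its children
        stack.extend(children.get(pid, []))

    # Reversing puts every child ahead of its parent.
    return list(reversed(order))


if __name__ == "__main__":
    # Hypothetical table: osquerybeat (pid 200) has lost its parent.
    table = [
        (200, 9999, "osquerybeat.exe"),
        (300, 200, "osqueryd.exe"),
        (400, 300, "osquery-extension.exe"),
        (500, 4, "unrelated.exe"),
    ]
    print(kill_order(table, 200))  # [400, 300, 200]
```

On Windows, `taskkill /PID <pid> /T /F` ends a process together with its child processes, so in many cases passing it the root pid should be enough.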

Use Process Explorer to see what happens to the agent process tree during the restart.

Maybe try to capture the agent logs if this happens again.
Check the system logs to see if there is anything pointing to the agent service process.
