Slow Event Analyzer queries

Hey Everyone,

I'm having some issues with the Event Analyzer for Elastic Defend rules.
The analyzer queries are extremely slow and time out quite often.

With the default elasticsearch.requestTimeout of 30000 (30s), almost all of them time out. Increasing it to 90s still leaves about half of them timing out.

Using the Event Analyzer from the main Alerts page is simply not possible; it times out every single time.


Going into a timeline and then trying to use it times out about 50% of the time, depending on the complexity of the rule and the underlying data that is being searched/visualized.

What's interesting is, if I copy the query from the timeline and use it in Discover, it is near instant. For example:

host.os.type:linux and event.category:process and event.type:start and event.action:exec and
process.executable:/usr/bin/systemctl and process.args:(enable or reenable or start) and 
process.entry_leader.entry_meta.type:* and
not (
  process.entry_leader.entry_meta.type:(container or init or unknown) or
  process.parent.pid:1 or
  process.parent.executable:(
    /bin/adduser or /bin/dnf or /bin/dnf-automatic or /bin/dockerd or /bin/dpkg or /bin/microdnf or /bin/pacman or
    /bin/podman or /bin/rpm or /bin/snapd or /bin/sudo or /bin/useradd or /bin/yum or /usr/bin/dnf or
    /usr/bin/dnf-automatic or /usr/bin/dockerd or /usr/bin/dpkg or /usr/bin/microdnf or /usr/bin/pacman or
    /usr/bin/podman or /usr/bin/rpm or /usr/bin/snapd or /usr/bin/sudo or /usr/bin/yum or /usr/sbin/adduser or
    /usr/sbin/invoke-rc.d or /usr/sbin/useradd or /var/lib/dpkg/*
  ) or
  process.args_count >= 5
)

I've also tried calling the API endpoint that times out directly, for example https://example.com/api/endpoint/resolver/entity?_id=egz7tpEBlGjjuluGyJ-9&indices=.alerts-security.alerts-default&indices=logs-*&indices=traces-apm*. Once it has timed out, it times out on every single subsequent request.

If the request succeeds, all subsequent requests to that API endpoint are as sluggish as the first response.

I'd appreciate any tips on further troubleshooting.

ELK version : 8.15.0
Agent version : 8.15.0
Integration version: 8.15.1

Cheers,
Luka

Bumping this in hopes of getting a response

The first thing to try is setting securitySolution:excludeColdAndFrozenTiersInAnalyzer to true in app/management/kibana/settings. This stops analyzer queries from hitting the cold and frozen data tiers, which can cause exactly what you are seeing.

If this doesn't help, the key will be paring down the logs-* index pattern that the analyzer API calls use by default to something narrower. If you are using Defend, logs-endpoint-* instead of logs-* can help tremendously, especially when a high number of indices match logs-*.

The process analyzer API actually makes a series of requests to Elasticsearch to 'walk' the tree, and this is done iteratively for each process. The query you shared covers a different set of data, and it's currently not possible to show in a single Discover query the same set of data that a working analyzer would show. It's possible for a few Elasticsearch queries to replicate the data, although this requires some additional mappings; see "Has child query" in the Elasticsearch Guide [8.15].
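
To illustrate why an iterative tree walk costs more than one flat Discover query, here is a toy Python sketch. It is not the analyzer's actual implementation: the in-memory EVENTS list stands in for an index, fetch_children is a hypothetical helper that represents one search request, and the flattened entity_id / parent_entity_id keys are stand-ins for the ECS fields process.entity_id and process.parent.entity_id.

```python
from collections import deque

# Toy stand-in for an Elasticsearch index. Each event carries flattened
# versions of process.entity_id and process.parent.entity_id.
EVENTS = [
    {"entity_id": "a", "parent_entity_id": None, "name": "systemd"},
    {"entity_id": "b", "parent_entity_id": "a", "name": "sshd"},
    {"entity_id": "c", "parent_entity_id": "b", "name": "bash"},
    {"entity_id": "d", "parent_entity_id": "c", "name": "systemctl"},
    {"entity_id": "e", "parent_entity_id": "c", "name": "ls"},
]


def fetch_children(parent_id, counter):
    """Hypothetical helper: stands in for one search request per process."""
    counter["queries"] += 1
    return [e for e in EVENTS if e["parent_entity_id"] == parent_id]


def walk_descendants(root_id):
    """Breadth-first walk that issues one lookup per visited process."""
    counter = {"queries": 0}
    found = []
    queue = deque([root_id])
    while queue:
        current = queue.popleft()
        for child in fetch_children(current, counter):
            found.append(child["name"])
            queue.append(child["entity_id"])
    return found, counter["queries"]


names, queries = walk_descendants("b")
print(names, queries)
```

In this toy model, three descendants already cost four lookups. If each lookup in a real cluster has to fan out across every index matching logs-*, the cost per lookup grows with the number of matching indices, which would be consistent with a narrower pattern helping so much.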

Both of those seem to have helped, especially narrowing down the index pattern it looks at.
One more thing to note: the index patterns have to include .alerts-security.alerts-default, otherwise it just throws the error "Cannot load data".
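
For anyone who wants to test the endpoint by hand with a narrower pattern, here is a small Python sketch that builds the same kind of URL as in my first post. The function name resolver_entity_url is made up for illustration, the request shape is only assumed from the URL I saw in the browser, and the sketch does not send the request or handle authentication.

```python
from urllib.parse import urlencode

# The alerts index has to be in the list, otherwise the analyzer
# fails with "Cannot load data" (as observed in this thread).
ALERTS_INDEX = ".alerts-security.alerts-default"


def resolver_entity_url(base_url, doc_id, indices):
    """Build the resolver entity URL with one `indices` parameter per pattern."""
    patterns = list(indices)
    if ALERTS_INDEX not in patterns:
        patterns.insert(0, ALERTS_INDEX)
    query = [("_id", doc_id)] + [("indices", p) for p in patterns]
    # safe="*" keeps wildcards readable instead of percent-encoding them.
    return f"{base_url}/api/endpoint/resolver/entity?{urlencode(query, safe='*')}"


url = resolver_entity_url(
    "https://example.com", "egz7tpEBlGjjuluGyJ-9", ["logs-endpoint-*"]
)
print(url)
```

Comparing the response time of this URL against the original one with logs-* and traces-apm* is a quick way to check how much the index pattern matters in a given cluster.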


Thank you for the help!