Failed execution of ESQL query and high cpu load

Hi There,
I have a one node cluster here for testing.

Hardware Spec:
16 Core // 148GB // 6 SAS 15K Raid0

Everything is running smoothly except for Elastic Security.
As soon as I perform an action there, I get Kibana timeouts (50 sec), several error messages about ES|QL queries, and a load1 of 7 to 9.

[2023-11-13T10:41:51,722][INFO ][o.e.x.e.a.EsqlResponseListener] [elastic01] Failed execution of ESQL query.
Query string: [from .alerts-security.alerts-default,apm-*-transaction*,auditbeat-*,endgame-*,filebeat-*,logs-*,packetbeat-*,traces-apm*,winlogbeat-*,-*elastic-cloud-logs-* | limit 10]
Execution time: [55969]ms

Only 150 of the original Elastic rules are active.
Currently only 7 Elastic Agents are rolled out, so I think the hardware should be sufficient.
Why is there such a severe loss of performance with Elastic Security?
Does anyone have any ideas?

Hello, I'd like to know whether performance gets better if you remove logs-* or query only one index pattern at a time.

Hi Angela,
thanks for your quick reply. No, I haven't tested that yet.
Is it possible to see from the log file why the query fails or takes so long?

We might not always be able to find the cause from the log file.
Querying the index patterns one by one lets us identify the pattern that is slower than expected.
We've seen some cases where logs-* slows down overall performance, which is why I suggested starting by removing logs-* from the query and seeing whether that improves things.
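
To make the one-by-one test less tedious, here is a minimal Python sketch that splits the failing query from the log above into one ES|QL query per index pattern. The helper name `split_esql_from` is made up for illustration, and carrying the exclusion pattern (`-*elastic-cloud-logs-*`) over into every generated query is an assumption about what you'd want. It only builds the query strings; you would still run each one yourself and compare the execution times.

```python
def split_esql_from(query: str) -> list[str]:
    """Return one ES|QL query per included index pattern.

    Hypothetical helper: patterns starting with '-' are treated as
    exclusions and are appended to every generated query.
    """
    head, sep, tail = query.partition("|")
    source = head.strip()
    if not source.lower().startswith("from "):
        raise ValueError("expected a query that starts with 'from'")

    patterns = [p.strip() for p in source[5:].split(",") if p.strip()]
    includes = [p for p in patterns if not p.startswith("-")]
    excludes = [p for p in patterns if p.startswith("-")]

    # Re-attach the rest of the pipeline (e.g. "limit 10") unchanged.
    pipeline = f" | {tail.strip()}" if sep else ""
    return [f"from {','.join([inc, *excludes])}{pipeline}" for inc in includes]


# Query string copied from the log message in the first post.
QUERY = (
    "from .alerts-security.alerts-default,apm-*-transaction*,auditbeat-*,"
    "endgame-*,filebeat-*,logs-*,packetbeat-*,traces-apm*,winlogbeat-*,"
    "-*elastic-cloud-logs-* | limit 10"
)

for single_pattern_query in split_esql_from(QUERY):
    print(single_pattern_query)
```

Each printed line can be pasted into the ES|QL tab of a timeline; the pattern whose query is noticeably slower than the others is the one to look at more closely.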

OK, I'll give it a try.
Can you please show or explain to me where I can remove logs-*?

  1. In /app/management/kibana/dataViews, find the data view tagged Security Data View and click on it. Select Edit and remove logs-* from both the Name field and the Index pattern field.

  2. Visit /app/security/timelines, create a new timeline or update an existing one, open the ES|QL tab, remove logs-* from the query, and click Update.

  3. Whenever you run an ES|QL query, try not to include logs-* in it and check whether performance improves.


If you'd like to know more about Data view: Create a Data view | Kibana Guide [8.11] | Elastic

Angela wow, :grinning:
that was the problem. From execution times of 50 sec and more, we are now at 100 ms on average. That is a significant step forward.
You should open a ticket if you haven't already done so.

Thank you for your help.
Stefan

I rejoiced a little too soon.
For some reason I don't understand, logs-* was added to the Data View again.
Is there some automatic process that does this (a new index, etc.)?

There's another place where you can remove logs-* from the data view.

In the Security Solution app, there should be a data view dropdown in the top-right corner; you can deselect logs-* there as well.

Angela,
Yes, OK, but the main problem is that for some reason logs-* is automatically added to the data view again.
I have opened a ticket here so that this bug can be investigated, in the hope that this problem will be solved soon. It's no fun to use the security app because everything that has to do with the security app is so slow.
[Security Solution] Failed execution of ESQL query and high cpu load, Security Solution not usable · Issue #171108 · elastic/kibana (github.com)

Thanks so much for doing this. I've notified the team that's in charge of this feature.
In the meantime, if you deselect logs-* from the Data view dropdown in the Security app, the deselected item won't come back until you re-select it.

Angela,
if I remove logs-* from Hosts, for example, I no longer have any data to display...

That's because logs-* includes all the indices that match this pattern.
I'd recommend running GET /logs-*/_stats in Dev Tools to identify which indices you'd like to query. Then create a new data view that includes only those indices, so it's easier to tell which ones are problematic.
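
If the _stats output is long, a small script can help pick out the largest indices. The following is a minimal Python sketch that ranks indices by primary store size. It assumes you have saved the JSON response of GET /logs-*/_stats and that it has the usual `indices.<name>.primaries.docs.count` and `indices.<name>.primaries.store.size_in_bytes` fields. The function name and the sample index names and numbers are made up for illustration and are not taken from this cluster. Large indices are only candidates for closer inspection, not proof that they cause the slowdown.

```python
def rank_indices_by_size(stats: dict, top: int = 10) -> list[tuple[str, int, int]]:
    """Return (index name, doc count, size in bytes), largest first."""
    rows = []
    for name, data in stats.get("indices", {}).items():
        primaries = data.get("primaries", {})
        docs = primaries.get("docs", {}).get("count", 0)
        size = primaries.get("store", {}).get("size_in_bytes", 0)
        rows.append((name, docs, size))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:top]


# Hypothetical sample in the shape of a _stats response (values invented).
SAMPLE_STATS = {
    "indices": {
        ".ds-logs-example.small-default-000001": {
            "primaries": {"docs": {"count": 1200}, "store": {"size_in_bytes": 1_000_000}}
        },
        ".ds-logs-example.big-default-000001": {
            "primaries": {"docs": {"count": 9_000_000}, "store": {"size_in_bytes": 5_000_000_000}}
        },
        ".ds-logs-example.medium-default-000001": {
            "primaries": {"docs": {"count": 300_000}, "store": {"size_in_bytes": 200_000_000}}
        },
    }
}

for name, docs, size in rank_indices_by_size(SAMPLE_STATS):
    print(f"{name}: {docs} docs, {size / 1024**2:.1f} MiB")
```

To use it with real data, replace SAMPLE_STATS with the parsed response, for example via `json.load` on a file that holds the Dev Tools output.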

I've looked into why logs-* is added back to the Security Solution data view.
It's because we didn't change the default that it uses here:

Visit /app/management/kibana/settings?query=category:(securitySolution)

Remove logs-* from Elasticsearch indices.

After that, you should be able to visit /app/management/kibana/dataViews and remove logs-* from there. When you land on the Security app again, logs-* shouldn't be added back automatically.

Angela,
thank you for your time and effort, but this is not really a solution.
So that I can see everything relevant in the Security app, I have added the indices logs-endpoint*,logs-sophos.* instead of logs-*.
Yes, it has become a little faster, especially with smaller time frames, but a 24h range still takes several seconds to return a result, and I have a load1 of 9, on this hardware.
In addition, the map is no longer displayed in the Security app under Network; presumably an index is missing here.
All in all, this is an unsatisfactory solution, and your team should invest more time here so that the performance of the Security app improves significantly.

Thanks for the feedback, I've taken this issue back to the team and we're working on this.

