Frozen Tier CPU spikes - in relation to SIEM alerts

I have been reconfiguring my Elastic Cloud setup to try to improve performance. I created a frozen tier node, and a lot of indices have moved from warm to searchable snapshots. However, looking at the CPU of the frozen node, I can see CPU spikes. These spikes coincide with spikes in alerts running and in field_caps events on that node - Elasticsearch.audit.action: indices:data/read/field_caps[index][s]

I have used Winlogbeat with rollover and have about 150 winlogbeat indices in the searchable snapshots that the field_caps event runs against.

I am looking to see if this is expected/normal. I am moving away from Winlogbeat to the Fleet integrations/streams and hoping this will help. It seems a bit odd that old indices are still causing CPU usage for alerts.

Hoping that someone can help shed some light on this so that I can better tune/plan for the future.

Some metrics:

What is the output from the _cluster/stats?pretty&human API?

These stats were taken at a time of day when ingest is low; the same CPU spikes are still ongoing on the frozen node.

Cool thanks, there's nothing obvious there.

What's the output from the _nodes/hot_threads API when these spikes happen?
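For anyone trying to catch this during a spike, here is a minimal Python sketch that builds and calls the hot threads request for a single node. The function names, base URL, node name, and API key are placeholders I made up for illustration; `threads`, `interval`, and `type` are query parameters of the hot threads API.

```python
import urllib.request
from urllib.parse import quote, urlencode


def hot_threads_url(base_url, node_filter, threads=5, interval="500ms"):
    """Build the _nodes/<node>/hot_threads URL for one node (or node filter)."""
    query = urlencode({"threads": threads, "interval": interval, "type": "cpu"})
    node = quote(node_filter, safe="")
    return f"{base_url.rstrip('/')}/_nodes/{node}/hot_threads?{query}"


def fetch_hot_threads(base_url, node_filter, api_key):
    """Call the API and return the plain-text report (needs a reachable cluster)."""
    request = urllib.request.Request(
        hot_threads_url(base_url, node_filter),
        headers={"Authorization": f"ApiKey {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

Running `fetch_hot_threads` in a loop while the CPU graph is spiking, and saving each report with a timestamp, makes it easier to line the output up with the alert runs afterwards.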

How many days before the data is moved into Frozen AND what is the lookback on the Alert that you are running?

Also, did you use the ILM GUI to set up the ILM policy? It sounds like you are running Hot / Warm / Frozen, which is perfectly fine.

Apologies, probably my confusion: are you saying the field_caps is running on its own, or is that something you are purposely running?

I did use the ILM GUI, stored on hot for about 7 days, warm for 30 days, then frozen (searchable snapshots in Azure). The look back on the alerts is at max 60 minutes.
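To make the mismatch concrete, here is a rough Python sketch of the arithmetic. It assumes the phases run back to back (7 days hot, then 30 days warm, then frozen) and measures age from "now", which is a simplification of how ILM actually counts `min_age`. The constant and function names are just for illustration.

```python
# Assumed phase lengths, taken from the description above.
HOT_DAYS = 7
WARM_DAYS = 30

MINUTES_PER_DAY = 60 * 24


def phases_touched_by_lookback(lookback_minutes):
    """Return the phases whose data could fall inside the lookback window."""
    oldest_age_days = lookback_minutes / MINUTES_PER_DAY
    # Each phase starts once data reaches this age in days.
    phase_starts = [
        ("hot", 0),
        ("warm", HOT_DAYS),
        ("frozen", HOT_DAYS + WARM_DAYS),
    ]
    return [name for name, start in phase_starts if start <= oldest_age_days]


print(phases_touched_by_lookback(60))  # 60-minute lookback
```

Under those assumptions a 60-minute lookback only reaches data in the hot phase, so none of the documents an alert needs should live on the frozen node.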

field_caps is not something I have run, just events that I see on all of the nodes, but I am trying to correlate between the SIEM alerts running, the events on the frozen node and the CPU spikes. Within the event, the field Elasticsearch.audit.indices lists all of the winlogbeat-00000xx indices from hot, warm to frozen.
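One way to experiment by hand is to issue a field_caps request yourself and compare it with and without an `index_filter`, which the field capabilities API accepts in the request body. The Python sketch below only builds the request; the function name, the `winlogbeat-*` pattern, and the `@timestamp` field are assumptions for illustration, and I do not know whether the alerting rules send anything like this internally.

```python
import json


def field_caps_request(index_pattern, lookback="now-60m", timestamp_field="@timestamp"):
    """Return (method, path, body) for a field_caps call limited by a range filter."""
    path = f"/{index_pattern}/_field_caps?fields=*"
    body = {
        "index_filter": {
            "range": {timestamp_field: {"gte": lookback}}
        }
    }
    return "POST", path, body


method, path, body = field_caps_request("winlogbeat-*")
print(method, path)
print(json.dumps(body, indent=2))
```

The printed method, path, and body can be pasted into Dev Tools. Watching the frozen node's CPU while running it with and without the body may show whether the unfiltered call is what fans out to all of the old indices.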

Fairly difficult to capture but hopefully helpful
Here's the output of all

and here's a focus on the frozen node

Curious @probson, do you have a commercial license/support?

Since I think you are on Elastic Cloud, support is included.

This would seem like an excellent candidate for opening a ticket.

I do not have an understanding of what is happening.

I do, we're on Cloud Platinum.

I will raise a ticket, thanks for taking a look though
