Security Detection Rules Cause: `circuit_breaking_exception` on medium-ish deployments

BenB196 · October 13, 2021, 6:43pm

Hi All,

I noticed that a few detection rules consume a lot of memory and often cause circuit_breaking_exception in medium-ish deployments (~265 winlogbeat deployments).

Elasticsearch version: 7.14.1

Offending Rules:

Installation of Custom Shim Databases
Parent Process PID Spoofing
Potential Process Herpaderping Attempt

These rules all seem to need a good amount of memory to run. Here is the exception I generally see:

An error occurred during rule execution: message: "circuit_breaking_exception: [circuit_breaking_exception] Reason: [eql_sequence] Data too large, data for [sequence_inflight] would be [3221924600/3gb], which is larger than the limit of [3221225472/3gb]" name: "Installation of Custom Shim Databases" id: "ac10bafe-a91f-11eb-a252-7f35a8822039" rule id: "c5ce48a6-7f57-4ee8-9313-3d0024caee10" signals index: ".siem-signals-security"

Has anyone else run into this issue with rules? If you did, how did you solve it?

BenB196 · October 13, 2021, 6:46pm

For further context: the rules are executed across 3 coordinating-only nodes in the cluster, each with 8GB of RAM (6GB of heap).
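For readers trying to relate the numbers in the error to the heap size, here is a rough arithmetic sketch. It assumes the heap is exactly 6 GiB and that the eql_sequence breaker limit is 50% of heap (believed to be the default for `breaker.eql_sequence.limit`, but worth checking against the docs for your version). The two byte counts are copied from the error message above; the function name is made up for illustration.

```python
GIB = 1024 ** 3

# Values copied from the circuit_breaking_exception message in this thread.
REQUESTED_BYTES = 3_221_924_600  # "data for [sequence_inflight] would be"
LIMIT_BYTES = 3_221_225_472      # "larger than the limit of"

# Assumptions (not confirmed in the thread): heap is exactly 6 GiB and the
# eql_sequence breaker limit is 50% of the heap.
ASSUMED_HEAP_BYTES = 6 * GIB
ASSUMED_LIMIT_FRACTION = 0.5


def breaker_overshoot(requested: int, limit: int) -> int:
    """Return how many bytes the request exceeds the breaker limit by."""
    return requested - limit


if __name__ == "__main__":
    print("limit in GiB:", LIMIT_BYTES / GIB)
    print("overshoot in bytes:", breaker_overshoot(REQUESTED_BYTES, LIMIT_BYTES))
    print("assumed limit:", int(ASSUMED_HEAP_BYTES * ASSUMED_LIMIT_FRACTION))
```

Under those assumptions the reported limit is exactly half of the heap, and the sequence was stopped at the point where its in-flight data would have crossed that limit.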

RylandHerrick · October 14, 2021, 11:44pm

Hi @BenB196! I can't say for certain, but since all of those rules are EQL rules, I'm guessing that you may be missing fields that the EQL sequence is attempting to join upon. It's likely process.entity_id, but you'd have to verify that.

When a join field is missing, EQL treats it as null and joins on that value. Every event without the field then ends up in the same group, which produces a lot of unexpected sequences and causes the circuit breaking exception you're seeing.
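To illustrate why a null join key is so expensive, here is a toy model in Python. It is not how Elasticsearch implements EQL sequences; it only buckets events by a join key (missing field becomes `None`) and counts the pairs that could be matched within each bucket. The helper names and the flat-dict event shape are assumptions made for the example.

```python
from collections import defaultdict


def group_by_join_key(events, join_field):
    """Bucket events by join key; a missing field is treated as None (null)."""
    buckets = defaultdict(list)
    for event in events:
        buckets[event.get(join_field)].append(event)
    return buckets


def candidate_pairs(buckets):
    """Count unordered event pairs per bucket, a naive proxy for join work."""
    return {key: len(evts) * (len(evts) - 1) // 2 for key, evts in buckets.items()}


if __name__ == "__main__":
    # Six events spread over three processes: small buckets, few pairs.
    with_ids = [{"process.entity_id": f"p{i % 3}"} for i in range(6)]
    # Six events with no process.entity_id at all: one big None bucket.
    without_ids = [{"event.code": "1"} for _ in range(6)]

    for name, events in (("with ids", with_ids), ("without ids", without_ids)):
        pairs = candidate_pairs(group_by_join_key(events, "process.entity_id"))
        print(name, sum(pairs.values()))
```

In this toy model the pair count in the `None` bucket grows quadratically with the number of events, which is the kind of blow-up that would fill the sequence's in-flight memory on a real cluster.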

While we're actively discussing a fix for this behavior, the near-term solution would be to duplicate those rules and edit the sequence to either remove that join field or replace it with one that is more appropriate for your data.

I hope that helps, cheers!

warkolm · October 15, 2021, 12:11am

What is the output from the _cluster/stats?pretty&human API in Elasticsearch?

BenB196 · October 18, 2021, 1:50pm

Here is the output of the requested command: { "_nodes" : { "total" : 30, "successful" : 30, "failed" : 0 - Pastebin.com

Note: I've upgraded the cluster to 7.15.1 since originally making this post.

warkolm · October 19, 2021, 2:55am

Thanks for that. What size heaps do you run? It looks like ~6GB per node.

BenB196 · October 19, 2021, 12:51pm

Yes, the coord nodes run ~6GB (they have 8GB total and use the auto heap setting, so whatever Elasticsearch decides for their heap is what they get).
