I have a processor that sends vulnerability reports to an index, and thousands of documents can be ingested within seconds. I've set up a Security alert using a custom query rule, where the query just filters for documents with a high risk score.
I see 500+ documents meeting the query within the look-back time, so I expect 500+ alerts.
Instead I get somewhere between 300 and 400+ alerts every time I ingest the documents. I ran the same ingestion every hour to ensure the look-back window only covers documents that were ingested together. Sometimes I get 100 alerts per interval until the total reaches 300-400+. After increasing Kibana's memory, I was able to get the 300-400+ alerts in a single interval, but the number of alerts is neither consistent nor does it match the expected count (1:1 with qualifying documents).
I suspected some documents may be ill-formatted, so I indexed a single document that hadn't produced an alert, and on its own it did generate one. So the documents are properly formatted and ECS compliant.
The cluster is set up with the default ECK Helm chart 2.9.0, with Elasticsearch and Kibana at 8.9.0.
I am not sure what might be causing this, whether it is a known issue (I can't find other mentions of it), or whether per-document alerting on thousands of documents ingested at virtually the same time is simply not intended functionality of the security alert.
If you are expecting 500 alerts every rule run, I would expand the max_signals field on the rule in question if you haven't done so already, perhaps to 500 or 1000, something in that ballpark. It defaults to 100, which is probably why you're seeing the chunks of 100 every interval. Besides that, do you know if your data has a lot of docs with identical timestamps? That might help triage where the issue is.
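One quick way to check for identical timestamps is a terms aggregation that only keeps buckets with two or more documents. A minimal sketch of the query body, assuming the default `@timestamp` field (adjust if your rule uses a timestamp override) and a placeholder index name:

```python
import json

# Sketch: terms aggregation that surfaces timestamps shared by 2+ documents.
# Assumes the default @timestamp field; "vuln-reports" is a placeholder index.
query = {
    "size": 0,  # we only want the aggregation buckets, not hits
    "aggs": {
        "dup_timestamps": {
            "terms": {
                "field": "@timestamp",
                "min_doc_count": 2,  # only keep buckets with duplicates
                "size": 50,
            }
        }
    },
}

# Send this as e.g.: POST /vuln-reports/_search
print(json.dumps(query, indent=2))
```

If the response comes back with large buckets, many documents collide on the same timestamp.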
I have already increased max_signals to 1000.
I didn't look too closely at the timestamps, but I suspect many documents share them since they are all bulk-created at once. Would that be a problem?
We have a known issue that can cause problems when sets of documents share the same timestamp. We're currently writing up a GitHub issue that I'll link soon, but the best workaround for now is to ensure each document has a unique timestamp. Are you using a timestamp override field in your rule definition? If so, the field you'll want to check depends on that configuration; otherwise it'll be the default `@timestamp` field.
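If you control the ingest code, one way to apply that workaround is to spread each batch across distinct millisecond offsets before bulk-indexing. A sketch under that assumption (the field and document shapes here are placeholders):

```python
from datetime import datetime, timedelta, timezone

def assign_unique_timestamps(docs):
    """Give each doc a distinct @timestamp by adding a per-doc
    millisecond offset to the batch's base time.

    Millisecond (not microsecond) steps are used because the default
    Elasticsearch `date` type only stores millisecond precision, so
    finer offsets would collapse back into the same stored value.
    """
    base = datetime.now(timezone.utc)
    for i, doc in enumerate(docs):
        ts = base + timedelta(milliseconds=i)
        doc["@timestamp"] = ts.isoformat()  # ISO-8601, accepted by date fields
    return docs

docs = assign_unique_timestamps([{"risk": 90} for _ in range(3)])
assert len({d["@timestamp"] for d in docs}) == 3  # all distinct
```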
Also, when updating rules through the UI (editing the rule and saving changes), max_signals is reset to 100, so double-check the rule object to see if it's what you expect it to be.
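You can read the rule back through Kibana's detection engine API to verify that max_signals survived the last UI save. A sketch that just builds the request URL ("my-rule-id" is a placeholder; host and authentication details are omitted):

```python
from urllib.parse import urlencode

# Sketch: build the read-rule request to verify max_signals after a UI edit.
# "my-rule-id" is a placeholder rule_id; localhost:5601 is an assumed host.
kibana = "http://localhost:5601"
params = urlencode({"rule_id": "my-rule-id"})
url = f"{kibana}/api/detection_engine/rules?{params}"

# e.g.: curl -H 'kbn-xsrf: true' "$url"  -- then inspect .max_signals
print(url)
```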
I see, thanks for pointing this out and linking the issue. I'll mark this as the solution for now as it resolves all my confusion around this issue.
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.