Hello,
I'm facing a problem with detection rules on an ELK stack. I need to upload events with past timestamps and run detection rules against them.
For example, I will upload a large number of events that look like this: {"timestamp":"2024-03-19 13:57:47.903Z","event.code":"1000"}, always with a timestamp in the past (it could be days, months or years ago).
Then I have a security detection rule with the following settings:
For now, the only way I have found to make it work is to set a huge additional look-back time on each rule. With this setting, the detection engine also checks past events.
I would like to know whether a better way exists, because this doesn't feel like the right or the most efficient way of doing it.
This is a great question. In short, you're right that "it's not the right and the most effective way of doing it". Unfortunately, as of today the app doesn't offer a feature that would handle this use case properly, although it is quite high on the development priority list.
As a workaround, setting the Additional look-back time parameter to cover a long interval (days, months or years in the past) can work in some cases, for example when there are few source events within the interval. But in general, it won't work well, or as you'd expect, because:
The larger the interval, the more source events fall within it, and the more resources the rule needs to process them on every run. On each run the rule has to query and process a lot of events, and deduplicate most of them to avoid generating the same alerts again and again. If there's too much data to process and/or Elasticsearch is under load, the rule will add to the load on the cluster and can start failing with timeouts. This doesn't scale well either for a single rule (when you need to keep increasing its interval) or for many rules (when you need to apply this workaround to more and more rules).
Each detection rule has a circuit breaker parameter called max_signals that limits the number of detection alerts the rule can generate per run. By default, it equals 100. In some cases, this circuit breaker can alleviate the issues described above, at the cost of "missing alerts". For example, if there are 10000 source events matching the rule's query within the interval, the rule will generate only 100 alerts and will not generate ("miss") the other 9900.
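To make the two points above a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It is not how the detection engine is implemented; it is a toy model under simple assumptions: one alert per matching source event, a fixed rule interval, and the timestamp format from the example event in the question. The function names are made up for illustration.

```python
from datetime import datetime, timezone


def required_lookback_seconds(oldest_event: str, now: datetime) -> int:
    """Look-back (in seconds) a rule running at `now` would need to still
    cover `oldest_event`. Assumes the timestamp format from the question."""
    ts = datetime.strptime(oldest_event, "%Y-%m-%d %H:%M:%S.%fZ")
    ts = ts.replace(tzinfo=timezone.utc)
    return max(0, int((now - ts).total_seconds()))


def runs_seeing_each_event(interval_s: int, lookback_s: int) -> int:
    """Toy estimate of how many consecutive rule runs query the same event,
    which is why most events have to be deduplicated on each run."""
    return (interval_s + lookback_s) // interval_s


def alerts_per_run(matching_events: int, max_signals: int = 100) -> tuple[int, int]:
    """Return (generated, missed) alerts for one run, assuming one alert per
    matching event and max_signals acting as a hard cap."""
    generated = min(matching_events, max_signals)
    return generated, matching_events - generated


if __name__ == "__main__":
    now = datetime(2024, 3, 20, 13, 57, 47, 903000, tzinfo=timezone.utc)
    lookback = required_lookback_seconds("2024-03-19 13:57:47.903Z", now)
    print(lookback)                               # 86400 (one day)
    print(runs_seeing_each_event(300, lookback))  # 289 runs with a 5-minute interval
    print(alerts_per_run(10000))                  # (100, 9900)
```

Even with only one day of look-back and a 5-minute interval, each event is re-queried by hundreds of runs in this model, and the numbers grow linearly as the look-back stretches to months or years.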
This is a difficult problem to solve properly at scale, and it will require a lot of effort from several teams in Kibana. There are public GitHub issues related to this problem:
As another workaround, if you only need to know whether there would be any alerts within a large interval in the past, and what those alerts would be, and you don't need them in the .alerts-security.alerts-* index, you could use the rule preview feature, which scales a bit better than the trick with the look-back parameter. The preview alerts are the same alert documents, but written to the .preview.alerts-security.alerts-* index instead.