Attack Discovery Questions and Feedback

willemdh · July 31, 2024, 7:30am

Hello,

So we updated to 8.14.3 earlier this week and I had the chance to play a bit with the Attack Discovery feature.

It started by hitting the max content length of 8K tokens a lot.

Asked our Azure engineer to provide a model with more tokens.

In the meantime I decreased the Alerts knowledgebase from 30 to 10.

The results were mixed imho. Most of the time we have between 250 and 400 alerts a day. How does it decide what alerts to analyse?

I noticed it chose a lot of alerts triggered by the rule "Multiple Alerts in Different ATT&CK Tactics on a Single Host". After closing all these alerts, they were still chosen by Attack Discovery.

Other times I saw it choose 10 identical alerts. Imho it's not very useful when 10 identical alerts are being analysed.

Imho it would be nice if we could have some configuration options to help decide what alerts to analyse. For example only open alerts, only alerts in a specific time window, only host or only network alerts, maybe even allow us to select the specific alerts to analyse?

Best regards,

WillemD

Andrew_G · July 31, 2024, 5:29pm

Thanks for your feedback @willemdh!

It started by hitting the max content length of 8K tokens a lot.

Asked our Azure engineer to provide a model with more tokens.

In general, consider using one of the models in the Excellent category for Attack discovery in the Large language model performance matrix documentation.

The matrix is current as of 8.14.3, but FYI GPT-4o Attack discovery output will be improved via this PR targeting an upcoming release.

In the meantime I decreased the Alerts knowledgebase from 30 to 10.

The default of 20 alerts is typically under the token limits of the models in the matrix.

The number of input tokens consumed is influenced by:

- The number of alerts sent as context (configured via the Knowledge base setting)
- The fields sent in each alert (configured via the Anonymization setting)
- The amount of data present in each field, which varies from alert to alert
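
To make the three factors above concrete, here is a rough sketch of how they combine. It is not how Attack discovery counts tokens: the function name `estimate_input_tokens`, the dictionary-shaped alerts, and the 4-characters-per-token ratio are all assumptions for illustration, and real tokenizers vary by model.

```python
import json

# Assumption: a coarse rule of thumb, not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_input_tokens(alerts, allowed_fields):
    """Roughly estimate the input tokens for a batch of alerts.

    alerts: list of dicts (hypothetical shape, one dict per alert)
    allowed_fields: field names that would be sent, standing in for
        the fields allowed by the Anonymization setting
    """
    total_chars = 0
    for alert in alerts:
        # Keep only the fields that would actually be sent.
        kept = {k: v for k, v in alert.items() if k in allowed_fields}
        # The serialized size grows with the amount of data per field.
        total_chars += len(json.dumps(kept))
    return total_chars // CHARS_PER_TOKEN
```

Calling it with fewer alerts, or with a smaller `allowed_fields` set, lowers the estimate, which mirrors the two settings described above.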

It's therefore possible to send fewer alerts, and less data per alert, via customized settings. However, it's worth noting that the current generation of LLMs is also limited by output tokens. Decreasing the number of alerts sent, or sending less data per alert, reduces the size of the input, but even doing both may not be enough to overcome the output token limits of a model.

We're experimenting with potential improvements to Attack discovery that work around some of the token limitations of the current generation of models.

Most of the time we have between 250 and 400 alerts a day. How does it decide what alerts to analyse?

Attack discovery analyzes open and acknowledged alerts from the last 24 hours. It includes up to n alerts (default: 20) as configured by the Knowledge Base setting.

The n alerts sent as context are sorted by risk score. Using the default of 20 as an example, the 20 alerts with the highest risk score in the last 24 hours will be sent as context.
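
As a minimal sketch of the selection just described (not the actual implementation), the logic can be written as a filter followed by a sort. The function name and the alert keys `status`, `timestamp`, and `risk_score` are hypothetical; ties in risk score simply keep their input order here.

```python
from datetime import datetime, timedelta, timezone


def select_alerts_for_context(alerts, n=20, now=None):
    """Pick up to n open/acknowledged alerts from the last 24 hours,
    highest risk score first.

    alerts: list of dicts with hypothetical keys "status",
        "timestamp" (timezone-aware datetime) and "risk_score"
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=24)
    candidates = [
        a for a in alerts
        if a["status"] in {"open", "acknowledged"} and a["timestamp"] >= cutoff
    ]
    # Python's sort is stable, so equal risk scores keep input order.
    candidates.sort(key=lambda a: a["risk_score"], reverse=True)
    return candidates[:n]
```

With 250 to 400 alerts a day and n set to 10, only the 10 highest-scoring candidates would be returned, which is consistent with many similar high-risk alerts crowding out everything else.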

I noticed it chose a lot of alerts triggered by the rule "Multiple Alerts in Different ATT&CK Tactics on a Single Host". After closing all these alerts, they were still chosen by Attack Discovery.

The global date selector on the Security > Alerts page defaults to Today, which, depending on when the alerts were generated, may include a different set of alerts than Last 24 hours. To approximate a view of the alerts sent as context for Attack discovery via the Alerts page:

1. Select Last 24 hours from the Security > Alerts global date picker
2. Filter the page for open and acknowledged alerts
3. Sort the Risk Score column in the alerts table from high to low
4. Open the Sort fields popover on the alerts table, and drag Risk Score above @timestamp to sort first by Risk Score, and then by @timestamp
5. Select Rows per page: 20 (or higher) to match the configured number of alerts in the Knowledge base setting
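
The same view can be approximated as a search request body. The sketch below only builds the body as a Python dictionary; it is not the query Attack discovery itself runs. The field names `kibana.alert.workflow_status` and `kibana.alert.risk_score` are assumed to match your alert documents, and the descending order on `@timestamp` is also an assumption, since the steps above do not state a direction.

```python
def build_alerts_query(size=20):
    """Build a search body that mirrors the Alerts page steps above.

    size: number of alerts to return, matching the Knowledge base setting
    """
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    # Open and acknowledged alerts only (assumed field name).
                    {"terms": {"kibana.alert.workflow_status": ["open", "acknowledged"]}},
                    # Last 24 hours rather than "Today".
                    {"range": {"@timestamp": {"gte": "now-24h", "lte": "now"}}},
                ]
            }
        },
        "sort": [
            # Risk score first, then timestamp as the tie-breaker.
            {"kibana.alert.risk_score": {"order": "desc"}},
            {"@timestamp": {"order": "desc"}},
        ],
    }
```

The returned dictionary could be passed as the request body to a search against your security alerts index, for example from Dev Tools after serializing it to JSON.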

The above is illustrated by the screenshot below:

Other times I saw it choose 10 identical alerts. Imho it's not very useful when 10 identical alerts are being analysed.

Imho it would be nice if we could have some configuration options to help decide what alerts to analyse. For example only open alerts, only alerts in a specific time window, only host or only network alerts, maybe even allow us to select the specific alerts to analyse?

Thank you for sharing this. We're exploring improvements to Attack discovery that may include additional filtering of the alerts that serve as input.

jamesspi · July 31, 2024, 5:35pm

Thanks for the response, @Andrew_G !

@willemdh I'll just add that the standard token limit these days is 128K and beyond. You're actually paying more for GPT-4 at 8K than for GPT-4o at 128K.

James

willemdh · August 1, 2024, 4:34pm

Thanks (again) for all the info @Andrew_G and @jamesspi . I'll definitely test this and will experiment with other models asap.

system · August 29, 2024, 4:35pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
