Attack Discovery Questions and Feedback

Hello,

So we updated to 8.14.3 earlier this week and I had the chance to play a bit with the Attack Discovery feature.

It started by hitting the max context length of 8K tokens a lot.

Asked our Azure engineer to provide a model with more tokens.

In the meantime I decreased the Alerts knowledgebase from 30 to 10.

The results were mixed imho. Most of the time we have between 250 and 400 alerts a day. How does it decide what alerts to analyse?

I noticed it chose a lot of alerts triggered by the rule "Multiple Alerts in Different ATT&CK Tactics on a Single Host". After closing all these alerts they were still chosen by Attack Discovery.

Another time I saw it choose 10 identical alerts. Imho it's not very useful to analyse 10 exactly identical alerts.

Imho it would be nice if we could have some configuration options to help decide what alerts to analyse. For example only open alerts, only alerts in a specific time window, only host or only network alerts, maybe even allow us to select the specific alerts to analyse?

Best regards,

WillemD

Thanks for your feedback @willemdh!

It started by hitting the max context length of 8K tokens a lot.

Asked our Azure engineer to provide a model with more tokens.

In general, consider using one of the models in the Excellent category for Attack discovery in the Large language model performance matrix documentation.

The matrix is current as of 8.14.3, but FYI GPT-4o Attack discovery output will be improved via this PR targeting an upcoming release.

In the meantime I decreased the Alerts knowledgebase from 30 to 10.

The default of 20 alerts is typically under the token limits of the models in the matrix.

The number of input tokens consumed is influenced by:

  • The number of alerts sent as context (configured via the Knowledge base setting)

  • The fields sent in each alert (configured via the Anonymization setting)

  • The amount of data present in each field, which varies from alert to alert

It's therefore possible to send fewer alerts, and less data per alert, via customized settings. However, it's worth noting that the current generation of LLMs is also limited by output tokens. Decreasing the number of alerts sent, or sending less data per alert, reduces the size of the input, but even reducing both may not be enough to overcome the output token limit of a model.
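To get a feel for how those three factors interact, here is a minimal sketch of a back-of-the-envelope estimate. It is not how Kibana or any model counts tokens: the function name, the `allowed_fields` parameter (standing in for the Anonymization setting), and the 4-characters-per-token heuristic are all assumptions for illustration. Real tokenizers will give different numbers.

```python
import json

# Assumption: a rough heuristic of ~4 characters per token.
# Real tokenizers differ by model and by content.
CHARS_PER_TOKEN = 4


def estimate_input_tokens(alerts, allowed_fields, max_alerts):
    """Roughly estimate the tokens consumed by the alerts sent as context.

    alerts         -- list of dicts, one per alert (hypothetical shape)
    allowed_fields -- field names kept in each alert (stand-in for Anonymization)
    max_alerts     -- number of alerts sent (stand-in for the Knowledge base setting)
    """
    total_chars = 0
    for alert in alerts[:max_alerts]:
        # Keep only the allowed fields, then measure the serialized size.
        kept = {k: v for k, v in alert.items() if k in allowed_fields}
        total_chars += len(json.dumps(kept))
    return total_chars // CHARS_PER_TOKEN
```

Calling it with the same alerts but a smaller `max_alerts` or a shorter `allowed_fields` set shows how each setting shrinks the input. It says nothing about output tokens, which are limited separately.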

We're experimenting with potential improvements to Attack discovery that work around some of the token limitations of the current generation of models.

Most of the time we have between 250 and 400 alerts a day. How does it decide what alerts to analyse?

Attack discovery analyzes open and acknowledged alerts from the last 24 hours. It includes up to n alerts (default: 20) as configured by the Knowledge base setting.

The n alerts sent as context are sorted by risk score. Using the default of 20 as an example, the 20 alerts with the highest risk score in the last 24 hours will be sent as context.
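For anyone who wants to approximate that selection outside the UI, the sketch below builds an Elasticsearch search body with the same criteria. It is an approximation under assumptions, not the query Attack discovery actually runs: the function name is hypothetical, the descending `@timestamp` tie-breaker is a guess, and the feature may apply additional filters not described in this thread.

```python
def build_alert_context_query(max_alerts=20):
    """Build a search body approximating the alerts sent as context.

    Assumption: this mirrors the description in this thread only
    (open/acknowledged, last 24 hours, highest risk score first).
    """
    return {
        "size": max_alerts,
        "query": {
            "bool": {
                "filter": [
                    # Closed alerts are excluded.
                    {"terms": {"kibana.alert.workflow_status": ["open", "acknowledged"]}},
                    # Last 24 hours, not "Today".
                    {"range": {"@timestamp": {"gte": "now-24h", "lte": "now"}}},
                ]
            }
        },
        "sort": [
            {"kibana.alert.risk_score": {"order": "desc"}},
            # Assumption: newest first as the tie-breaker.
            {"@timestamp": {"order": "desc"}},
        ],
    }
```

The returned dict can be sent as the body of a search against your security alerts index. Note that nothing in these criteria de-duplicates alerts, so many identical high-risk alerts can fill the whole set.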

I noticed it chose a lot of alerts triggered by the rule "Multiple Alerts in Different ATT&CK Tactics on a Single Host". After closing all these alerts they were still chosen by Attack Discovery.

The global date selector on the Security > Alerts page defaults to Today, which, depending on when the alerts were generated, may include a different set of alerts than Last 24 hours. To approximate a view of the alerts sent as context for Attack discovery via the Alerts page:

  1. Select Last 24 hours from the Security > Alerts global date picker

  2. Filter the page for open and acknowledged alerts

  3. Sort the Risk Score column in the alerts table from high to low

  4. Open the Sort fields popover on the alerts table, and drag Risk Score above @timestamp to sort first by Risk Score, and then by @timestamp

  5. Select Rows per page: 20 (or higher) to match the configured number of alerts in the Knowledge base setting

The above is illustrated by the screenshot below:

Another time I saw it choose 10 identical alerts. Imho it's not very useful to analyse 10 exactly identical alerts.

Imho it would be nice if we could have some configuration options to help decide what alerts to analyse. For example only open alerts, only alerts in a specific time window, only host or only network alerts, maybe even allow us to select the specific alerts to analyse?

Thank you for sharing this. We're exploring improvements to Attack discovery that may include additional filtering of the alerts that serve as input.

Thanks for the response, @Andrew_G !

@willemdh I'll just add that the standard token limit these days is 128K and beyond. You're actually paying more for GPT-4 at 8K than for GPT-4o at 128K.

James

Thanks (again) for all the info @Andrew_G and @jamesspi . I'll definitely test this and will experiment with other models asap.