Machine learning use case - Anomaly Detection

Hi Everyone,

I'm currently developing and testing a Machine Learning-based use case in Elastic for anomaly detection. I've cloned and configured the ML job "auth_rare_hour_for_a_user", which is designed to detect user logins at unusual hours. The job is active and appears to be functioning correctly, with all relevant log sources added to the data feed.

In Kibana, I’ve also configured a "Security Alert rule" using this ML job. The rule is set up as follows:

Rule Name: "auth_rare_hour_for_a_user"
Description: Detects user logins at times that are unusual for the user, which may indicate credentialed access via a compromised account or unauthorized activity during non-business hours.

To test the setup, I performed several login activities during off-hours that should be considered anomalous. However, no alerts have been triggered so far.

Could anyone advise if I might be missing a step or configuration detail for ML-based detection rules like this? Are there specific thresholds, lookback periods, or job configurations I should double-check?

Any insights or suggestions would be greatly appreciated.

Thanks in advance!

Thanks for reaching out, @sunith. I have a few follow-up questions here:

  • What version of Elastic are you using?
  • Could you provide more information about how this alert is configured? Attaching a picture of the configuration could also be helpful.

Best,

Jessica

Thanks for the reply, @jessgarson.

Elastic version 8.18.3

Screenshot of the ML rule and associated job

Thanks for following up, @sunith. One thing that has tripped me up with anomaly detection in the past is that it requires a sufficient amount of baseline data (typically 2-4 weeks) to establish standard patterns before detecting anomalies reliably. Could this be what's happening here?

Thanks for the reply. Yes, I gave this job 1 month of baseline data.

Thanks for following up, @sunith. As a next step, please verify that the bucket span is appropriate for your use case (typically 15 minutes to 1 hour for login patterns).
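
If it helps, here is a rough Python sketch of that bucket span check. It assumes you have already copied the span string (for example "15m") out of the job's analysis_config.bucket_span setting. The helper names and the 15m to 1h bounds are just illustrations of the range mentioned above, not anything Elastic enforces, and only the s, m, h, and d units are handled.

```python
import re

# Seconds per unit, for the small subset of time units this sketch handles.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def bucket_span_seconds(span: str) -> int:
    """Convert a span string such as '15m' or '1h' into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", span.strip())
    if match is None:
        raise ValueError(f"unsupported bucket span: {span!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


def in_suggested_range(span: str, low: str = "15m", high: str = "1h") -> bool:
    """Return True if span lies within [low, high] (hypothetical bounds)."""
    return bucket_span_seconds(low) <= bucket_span_seconds(span) <= bucket_span_seconds(high)


if __name__ == "__main__":
    for candidate in ("15m", "1h", "4h"):
        print(candidate, in_suggested_range(candidate))
```

A span outside that range is not necessarily wrong; it is only a prompt to double-check that the setting matches how often the users you care about actually log in.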

It is also important that the baseline includes many examples of "normal" login times for the particular user you are testing. You should have dozens or even hundreds of "normal" examples of logins for that user before you attempt to inject an "anomalous" one.

If there are few (or, in the worst case, no) examples of "normal" login times for the user you are testing, the login events you inject during testing won't have enough history to contrast against, so they can't be judged unusual.
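
One way to sanity-check this before testing is to count how many baseline logins the test user actually has, and at which hours. Below is a minimal Python sketch. It assumes you have exported that user's login timestamps as ISO 8601 strings (for example from Discover). The function names and the min_events default of 50 are hypothetical choices for illustration, loosely following the "dozens or even hundreds" guidance above.

```python
from collections import Counter
from datetime import datetime


def logins_per_hour(timestamps):
    """Count events per hour of day (0-23) from ISO 8601 timestamp strings."""
    return Counter(datetime.fromisoformat(ts).hour for ts in timestamps)


def baseline_summary(timestamps, min_events=50):
    """Summarize one user's login history.

    min_events is an assumed cutoff for "enough" history, not an Elastic setting.
    """
    counts = logins_per_hour(timestamps)
    total = sum(counts.values())
    return {
        "total": total,
        "hours_seen": sorted(counts),
        "enough_history": total >= min_events,
    }


if __name__ == "__main__":
    # Made-up sample timestamps for a single user.
    sample = [
        "2025-07-01T09:15:00+00:00",
        "2025-07-02T09:40:00+00:00",
        "2025-07-03T10:05:00+00:00",
    ]
    print(baseline_summary(sample))
```

If the total is small, or the off-hours login you injected falls in an hour that already appears in hours_seen, that could explain why nothing was flagged.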