Anomaly Detection Assistance

Kibana version: 8.6.1 (ECK)

Elasticsearch version: 8.6.1 (ECK)

APM Server version: Fleet 8.6.1 (ECK)

APM Agent language and version:

Browser version: Chrome 110.0.5481.104

Original install method (e.g. download page, yum, deb, from source, etc.) and version: ECK

Fresh install or upgraded from other version?: Fresh

Looking for some general guidance regarding machine learning jobs. I have an ECK cluster with APM accepting OTEL data from a collector.

My goal is to use machine learning to detect anomalies/outliers (I'm not sure which is the correct route here) in a specific field within an APM index. For example, I'd like to be able to check via API whether the build of a product took longer than it typically does. So a build runs, and upon completion it checks Elastic to determine whether that particular build ID was an anomaly.

This all sounds simple (and probably is), but I'm not sure what path I should be taking to do this. The data is getting to Elastic, but I'm not sure how to analyze it. Should I use an anomaly detection job? Should I use a data frame analytics job? Should I not be using either? I have been able to successfully set up ongoing anomaly detection, but from what I can tell, anomaly detection works off a mean, median, etc., of a specified bucket. So I'm not sure how to go about determining whether just one specific data point is an anomaly.

Hoping this all makes sense. I'm new to all this so let me know if this doesn't make sense and I'll try and clarify.

Thanks in advance.

Anomaly detection is the right approach for time-series data (metrics, logs, etc.). Yes, it uses bucketing, but the user has control over the size of the bucketing (see bucket_span).

Anomaly detection will not tell you if a "single" measurement is anomalous in time, unless that measurement happens to be the only one in the current bucket_span. Usage of the max function, for example, will approximate that since it self-selects the single largest measurement in the bucket_span.
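As a rough illustration of the point above, here is a minimal sketch of a job configuration that uses the max function with an explicit bucket_span. It only builds the JSON body that would be sent to `PUT _ml/anomaly_detectors/<job_id>`; nothing is sent to a cluster. The field names `transaction.duration.us` and `service.name`, the 15m span, and the helper name `build_max_duration_job` are assumptions for illustration, not details from this thread.

```python
import json


def build_max_duration_job(duration_field, bucket_span="15m", partition_field=None):
    """Build an anomaly detection job body that models the max of a field per bucket."""
    detector = {
        "function": "max",             # picks the single largest value in each bucket
        "field_name": duration_field,  # assumed numeric duration field
    }
    if partition_field:
        # Optional: model each value of this field (e.g. each service) separately.
        detector["partition_field_name"] = partition_field
    return {
        "description": "Max build duration per bucket (illustrative sketch)",
        "analysis_config": {
            "bucket_span": bucket_span,  # the shorter the span, the fewer builds per bucket
            "detectors": [detector],
        },
        "data_description": {"time_field": "@timestamp"},
    }


# Hypothetical field names; substitute whatever your APM documents actually contain.
job = build_max_duration_job("transaction.duration.us", "15m", "service.name")
print(json.dumps(job, indent=2))
```

If builds are infrequent enough that at most one lands in each bucket_span, the max in a bucket is that build's duration, which is the approximation described above.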

@richcollier Thanks - that makes sense about using max. To clarify though, even when using max there is no way to correlate that to a specific document, is that correct? For example, if I run a build and want to check whether that build's final duration is an anomaly, is that not feasible?

You can always link from the Anomaly Detection results back to Discover to manually inspect the documents in that time bucket.
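For the API side of the same idea, a sketch like the following could work out which bucket a finished build falls into and prepare a request for the get records API (`GET _ml/anomaly_detectors/<job_id>/results/records`). It assumes buckets are aligned to multiples of bucket_span from the Unix epoch, and the job ID `build-duration-max` and both helper names are hypothetical. It only builds the path and body, so authentication and the HTTP call are left out.

```python
from datetime import datetime, timezone


def bucket_window(event_time, bucket_span_seconds):
    """Return (start, end) datetimes of the bucket containing event_time.

    Assumes buckets are aligned to multiples of the span since the Unix epoch.
    """
    epoch_s = int(event_time.timestamp())
    start_s = epoch_s - (epoch_s % bucket_span_seconds)
    start = datetime.fromtimestamp(start_s, tz=timezone.utc)
    end = datetime.fromtimestamp(start_s + bucket_span_seconds, tz=timezone.utc)
    return start, end


def build_records_request(job_id, event_time, bucket_span_seconds, min_score=0.0):
    """Build the path and JSON body for the anomaly records API for one bucket."""
    start, end = bucket_window(event_time, bucket_span_seconds)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    path = f"/_ml/anomaly_detectors/{job_id}/results/records"
    body = {
        "start": start.strftime(fmt),
        "end": end.strftime(fmt),
        "record_score": min_score,  # only return records scoring at least this much
        "sort": "record_score",
        "desc": True,
    }
    return path, body


# Hypothetical build that finished at 12:07:30 UTC, with a 15-minute bucket_span.
finished = datetime(2023, 3, 1, 12, 7, 30, tzinfo=timezone.utc)
path, body = build_records_request("build-duration-max", finished, 900, min_score=50.0)
print(path)
print(body)
```

An empty result for that window would suggest nothing in the build's bucket was flagged at or above the chosen score. Any returned record still describes the bucket rather than a single document, so the caveat above about one measurement per bucket applies.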

I do not have that option; it is greyed out. When I hover over it, I get:

Unable to link to Discover; no data view exists for index 'traces-apm*'

Just manually create a Data View for that index?
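The data view can be created in the Kibana UI, but if you would rather script it, a minimal sketch using Kibana's data views API (`POST /api/data_views/data_view`) might look like this. The Kibana URL and the helper name are placeholders, authentication is omitted, and the request is only constructed, not sent.

```python
import json
import urllib.request


def build_create_data_view_request(kibana_url, index_pattern, time_field="@timestamp"):
    """Build (but do not send) a request that creates a Kibana data view."""
    payload = {
        "data_view": {
            "title": index_pattern,  # index pattern the data view matches
            "name": index_pattern,   # display name; here the same as the pattern
            "timeFieldName": time_field,
        }
    }
    return urllib.request.Request(
        url=f"{kibana_url.rstrip('/')}/api/data_views/data_view",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "kbn-xsrf": "true",  # Kibana requires this header on write requests
        },
        method="POST",
    )


# Placeholder URL; add an Authorization header for a real cluster before sending.
req = build_create_data_view_request("https://kibana.example.com:5601", "traces-apm*")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen` once credentials are added.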

Got it - thank you.