Anomaly detection in Machine learning kibana

MAHALAKSHMI_S · November 24, 2021, 7:34am

I created a categorization-based anomaly detection job and explored the job results in Kibana.
I used the "Payload" field (a string) for categorization.
I'm not sure what the "typical" value in the Anomaly Explorer results signifies here.

P.S. I know that for a numerical feature the "typical" value signifies the median of those values, but I'm not sure what it means for a string field.

droberts195 · November 24, 2021, 10:36am

With ML categorization jobs you still do anomaly detection as well as categorization. Usually this uses a function of the category ID. It's almost always rare by mlcategory or count by mlcategory; since you don't know which you've got, it must be one of these, added by the categorization wizard. You can find out by looking at the job configuration in the ML jobs list.
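As a rough illustration of checking the detector programmatically rather than in the jobs list, the sketch below summarises the detectors found in a job configuration. It assumes you have already fetched the JSON from `GET _ml/anomaly_detectors/<job_id>`; the `sample_response` dictionary, including the job ID `payload_categorization`, is made up for the example, and `describe_detectors` is a hypothetical helper, not part of any Elastic client.

```python
# Sketch only: summarise the detectors in an anomaly detection job config.
# `sample_response` imitates the shape of the JSON returned by
#   GET _ml/anomaly_detectors/<job_id>
# The job_id and field values are invented for illustration.

def describe_detectors(job_response):
    """Return one 'function by/over field' string per detector."""
    summaries = []
    for job in job_response.get("jobs", []):
        for detector in job["analysis_config"]["detectors"]:
            text = detector["function"]
            if "by_field_name" in detector:
                text += " by " + detector["by_field_name"]
            if "over_field_name" in detector:
                text += " over " + detector["over_field_name"]
            summaries.append(text)
    return summaries


sample_response = {
    "count": 1,
    "jobs": [
        {
            "job_id": "payload_categorization",  # hypothetical job ID
            "analysis_config": {
                "bucket_span": "15m",
                "categorization_field_name": "Payload",
                "detectors": [
                    {"function": "count", "by_field_name": "mlcategory"}
                ],
            },
        }
    ],
}

print(describe_detectors(sample_response))
```

If the result contains `rare by mlcategory` instead, the typical value should be read as a probability rather than a count.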

If it's count by mlcategory then your typical and actual will be how many categories of Payload typically and actually occur per time bucket. If it's rare by mlcategory then typical will be the probability of seeing that category in a typical bucket.

You can see the category definitions without the anomaly information using the Get Categories API.
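A minimal sketch of reading a Get Categories response follows. The endpoint is `GET _ml/anomaly_detectors/<job_id>/results/categories/<category_id>`; the response shown here is invented for illustration (the job ID, terms, and example text are not from the original job), and `summarize_categories` is a hypothetical helper.

```python
# Sketch only: pull the ID, terms and first example out of a
# Get Categories response. The sample below imitates the response shape;
# all values in it are made up.

def summarize_categories(response):
    """Return (category_id, terms, first_example) for each category."""
    rows = []
    for category in response.get("categories", []):
        examples = category.get("examples", [])
        first_example = examples[0] if examples else None
        rows.append((category["category_id"], category.get("terms"), first_example))
    return rows


sample_categories = {
    "count": 1,
    "categories": [
        {
            "job_id": "payload_categorization",  # hypothetical job ID
            "category_id": 8,
            "terms": "connection timed out",  # invented terms
            "examples": ["connection to host timed out after 30s"],
        }
    ],
}

for category_id, terms, example in summarize_categories(sample_categories):
    print(category_id, terms, example)
```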

MAHALAKSHMI_S · November 24, 2021, 11:01am

Thanks for your reply, @droberts195.

I used count by mlcategory. However, the actual field gives the count of documents in that Payload category in the datafeed, which seems quite different from what you described in your last post.
Also, the typical field is a float.

droberts195 · November 24, 2021, 11:20am

What that information is saying is that for category 8 there were 1330 documents on 26th December 2021 that were classified as category 8. On average there are 221.6 documents per bucket in category 8. The reason typical is a float is that the expected value of a distribution of integers isn't always an integer. For example, the expected value from rolling a standard 6-sided die is 3.5, but you'll never roll 3.5 on a single roll.
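The die example can be checked in a few lines of Python. This is only the arithmetic behind the analogy, not how the ML job computes its typical value.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Expected value of equally likely integer outcomes, as an exact fraction."""
    return Fraction(sum(outcomes), len(outcomes))


die = range(1, 7)  # faces 1 to 6
ev = expected_value(die)

print(float(ev))   # the expected value of one roll
print(ev in die)   # whether that value is a face you could actually roll
```

The expected value is 21/6 = 3.5, which is not one of the six faces, just as a typical count of 221.6 is not a count any single bucket can have.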

MAHALAKSHMI_S · November 24, 2021, 1:43pm

Thanks, @droberts195, for clarifying this.

For a population analysis with a bucket span of 1 hour, it would be very helpful if you could explain the typical value.
In the attached screenshot, the typical value is the same for all anomalous messages. Why is that?

droberts195 · November 24, 2021, 3:03pm

For a population job, anomalies are created for entities that are significantly different from the population within the current time bucket. So the average count for the population in that time bucket is 2.27, and those two entities have much higher counts.
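To illustrate why every anomaly in the same bucket shares one typical value, here is a toy calculation. The per-entity counts are invented, chosen only so that the mean comes out near 2.27, and the plain mean plus a fixed 3x threshold are stand-ins for the real population model, which is more sophisticated.

```python
# Toy illustration only: one time bucket, hypothetical per-entity counts.
# Nine entities with 1 document each and two with 8 each (invented numbers).
counts = {f"entity-{i:02d}": 1 for i in range(1, 10)}
counts["entity-10"] = 8
counts["entity-11"] = 8

# One population-wide figure for the bucket: the same "typical" for everyone.
population_mean = sum(counts.values()) / len(counts)

# Stand-in rule (an assumption, not the ML algorithm): flag counts far above the mean.
outliers = sorted(name for name, c in counts.items() if c > 3 * population_mean)

print(round(population_mean, 2))
print(outliers)
```

Because the comparison baseline is computed once per bucket across the whole population, each flagged entity is reported against the same typical value, while its actual value differs.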

MAHALAKSHMI_S · November 25, 2021, 5:28am

Is it possible to get all buckets and count of documents in each bucket for category 8?

