ML: detecting an unusually low number of users

AmS · October 30, 2023, 3:54pm

Hello,
I'm using a machine learning anomaly detection job on version 7.17.
I would like to detect an unusually low number of users.
For that I'm using the low_distinct_count function.
It works fine as long as there is at least one user.
Example:

if the number of users goes from 17 to 1, the decrease is detected
if the number of users goes from 17 to 0, the decrease is NOT detected

Is there any condition I can add to the ML job so that the drop is detected even when all users are lost?

Thanks

richcollier · October 30, 2023, 5:30pm

This, in fact, should work as you are expecting it to. I'd love to see evidence (like a screenshot) of the situation that you describe with it not working! Please post here!

AmS · October 30, 2023, 10:01pm

Here are the two cases.

It works when at least one user is present:

[screenshot]

It does not work when there are no users:

[screenshot]

As you can see in the second graph, detection of the decrease starts on 28/10 rather than on 26/10 as in the first one.

FYI, I'm using a detector like "low_distinct_count(ID) by XYZ".

richcollier · November 2, 2023, 6:05pm

I stand corrected - it does ignore empty buckets, which is a little counter-intuitive to me but apparently, that's how it was designed. I've asked dev to consider making a feature enhancement to make the behavior optional (like we do by having count and non_zero_count function variants).

In the meantime, this can be accomplished via a workaround.

Use aggregations in the datafeed to calculate the cardinality of your field of choice. See examples here: Aggregating data for faster performance | Machine Learning in the Elastic Stack [8.10] | Elastic
Use the low_sum detector function on the aggregated field name.

AmS · November 3, 2023, 11:18am

OK,
low_sum is not adequate on its own, since the ID is not an aggregatable field.
I will try the aggregation in the datafeed.

richcollier · November 10, 2023, 6:49pm

I got it to work via the cardinality agg and the low_sum function:

job config:

Note the name of the cardinality agg (here it is dc_airline) is the same as what's used in the detector definition (low_sum(dc_airline)) and the value of the summary_count_field_name
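
The configuration itself did not survive in this post. As a rough sketch of what such a job and datafeed might look like, the Python below builds the two JSON request bodies as dicts. Only the name dc_airline, the low_sum function, and the summary_count_field_name setting come from the post; the index name, the job ID, the airline and timestamp field names, and the 15m bucket span and interval are assumptions for illustration.

```python
import json

# Name shared by the cardinality aggregation, the detector field,
# and summary_count_field_name, as described in the post.
AGG_NAME = "dc_airline"

# Assumed names for illustration; not taken from the original post.
TIME_FIELD = "timestamp"
SOURCE_FIELD = "airline"
INDEX = "my-index"
JOB_ID = "low-distinct-users"

# Body for the anomaly detection job.
job = {
    "analysis_config": {
        "bucket_span": "15m",
        "detectors": [{"function": "low_sum", "field_name": AGG_NAME}],
        "summary_count_field_name": AGG_NAME,
    },
    "data_description": {"time_field": TIME_FIELD},
}

# Body for the datafeed: a date_histogram containing a max aggregation
# on the time field plus the cardinality aggregation the detector reads.
datafeed = {
    "job_id": JOB_ID,
    "indices": [INDEX],
    "aggregations": {
        "buckets": {
            "date_histogram": {"field": TIME_FIELD, "fixed_interval": "15m"},
            "aggregations": {
                TIME_FIELD: {"max": {"field": TIME_FIELD}},
                AGG_NAME: {"cardinality": {"field": SOURCE_FIELD}},
            },
        }
    },
}

print(json.dumps(job, indent=2))
print(json.dumps(datafeed, indent=2))
```

The printed bodies could then be sent to the create job and create datafeed APIs, after replacing the assumed names with your own.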

AmS · November 20, 2023, 8:39am

Thanks for your answer,
I didn't manage to get the expected result.
I did the same thing, but I need to add a field XYZ as an influencer:

In the results of the job I don't have XYZ:

richcollier · November 20, 2023, 1:55pm

If you intend to split by using a by_field or a partition_field then your datafeed query has to also include a terms aggregation so that you get a service_cardinality value for every XYZ.

I suspect you don't have that at the moment

See example
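
The linked example is not preserved here. A minimal sketch of the idea, assuming a time field called timestamp and the ID and XYZ fields mentioned earlier in the thread: the cardinality aggregation is nested inside a terms aggregation on XYZ, so each XYZ value gets its own service_cardinality per histogram interval. The helper function name, the interval, and the terms size are hypothetical choices.

```python
import json

def split_cardinality_aggs(time_field, split_field, id_field, agg_name,
                           interval="15m", size=100):
    """Build datafeed aggregations that yield one cardinality per split value."""
    return {
        "buckets": {
            "date_histogram": {"field": time_field, "fixed_interval": interval},
            "aggregations": {
                # Datafeeds need a max aggregation on the time field.
                time_field: {"max": {"field": time_field}},
                # The terms aggregation is named after the split field so the
                # detector's by_field_name can refer to it.
                split_field: {
                    "terms": {"field": split_field, "size": size},
                    "aggregations": {
                        agg_name: {"cardinality": {"field": id_field}}
                    },
                },
            },
        }
    }

# "timestamp" is an assumed field name; XYZ and ID come from the thread.
aggs = split_cardinality_aggs("timestamp", "XYZ", "ID", "service_cardinality")

# Matching detector: low_sum on the aggregated value, split by XYZ.
detector = {
    "function": "low_sum",
    "field_name": "service_cardinality",
    "by_field_name": "XYZ",
}

print(json.dumps(aggs, indent=2))
print(json.dumps(detector, indent=2))
```

With this shape, XYZ is present in the aggregated output, so it can also be listed under influencers in the job's analysis_config.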

system · December 18, 2023, 1:55pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
