ML Anomaly Job with exclude_frequent option

Hello everyone,

I was reading through the docs and became curious about something.

Say I create two detectors.
One detector is high_sum(a) over b.
The other detector is high_sum(a) by c.

Now, if I define exclude_frequent = over for the first detector, will it completely exclude frequently occurring values of b from triggering anomalies?

Consequently, it would make no sense to set exclude_frequent = over on the second detector, high_sum(a) by c, since its over_field is empty?
So in the case of high_sum(a) by c we would need to set exclude_frequent = by to exclude frequent values of c from triggering anomalies?
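To make this concrete, here is roughly the kind of job config I have in mind (the field names a, b, and c are just placeholders, and the exclude_frequent values reflect my reading of the docs):

```
PUT _ml/anomaly_detectors/example-job
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      {
        "function": "high_sum",
        "field_name": "a",
        "over_field_name": "b",
        "exclude_frequent": "over"
      },
      {
        "function": "high_sum",
        "field_name": "a",
        "by_field_name": "c",
        "exclude_frequent": "by"
      }
    ]
  },
  "data_description": { "time_field": "@timestamp" }
}
```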

Am I understanding this correctly?

First and foremost, you should probably understand the major difference between using the over field and using a split field like by or partition. Using the over field results in a population analysis (comparing entities against the population), which is quite different from the usual temporal analysis (comparing an entity against its own history).

See: Temporal vs. Population Analysis in Elastic Machine Learning | Elastic Blog
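As a sketch of that distinction (job names and fields are made up here), the same function behaves quite differently depending on which kind of split you use:

```
# Population analysis: each value of "b" is judged against the
# collective behaviour of all values of "b".
PUT _ml/anomaly_detectors/population-example
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      { "function": "high_sum", "field_name": "a", "over_field_name": "b" }
    ]
  },
  "data_description": { "time_field": "@timestamp" }
}

# Temporal analysis: each value of "c" gets its own baseline and is
# judged only against its own history.
PUT _ml/anomaly_detectors/temporal-example
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      { "function": "high_sum", "field_name": "a", "by_field_name": "c" }
    ]
  },
  "data_description": { "time_field": "@timestamp" }
}
```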

Secondly, the use of exclude_frequent predates the creation of Filters in Custom Rules, which give you more flexibility and control over what gets excluded.
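For example (a rough sketch; the filter id and field names here are invented), you can maintain a filter of known-noisy entities and attach it to the detector with a custom rule, rather than relying on exclude_frequent:

```
# A filter listing the entities you want to ignore; it can be
# updated at any time without recreating the job.
PUT _ml/filters/noisy_b_values
{
  "description": "Values of b that should not produce anomalies",
  "items": ["value_1", "value_2"]
}

# Attach the filter to the detector via a custom rule so results
# for those entities are skipped.
PUT _ml/anomaly_detectors/example-job-with-rule
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      {
        "function": "high_sum",
        "field_name": "a",
        "over_field_name": "b",
        "custom_rules": [
          {
            "actions": ["skip_result"],
            "scope": {
              "b": { "filter_id": "noisy_b_values", "filter_type": "include" }
            }
          }
        ]
      }
    ]
  },
  "data_description": { "time_field": "@timestamp" }
}
```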
