Say I create two detectors.
One detector is high_sum(a) over b.
The other detector is high_sum(a) by c.
Now, if I define exclude_frequent = over for the first detector, will it completely exclude frequently occurring values in b from triggering anomalies?
Consequently, would it make no sense to define exclude_frequent = over for the second detector, high_sum(a) by c, since its over_field is empty?
So for high_sum(a) by c, would we need to set exclude_frequent = by to exclude frequent values in c from triggering anomalies?
First and foremost, you should understand the major differences between using the over field and using a split field like by or partition. Using the over field results in a population analysis (comparing each entity against the rest of the population), which is very different from the normal temporal analysis (comparing an entity against its own history).
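To make the distinction concrete, here is a rough sketch of how the two detectors from the question might look inside a job's analysis_config, written as Python dicts that serialize to the JSON the anomaly detection job API expects. The field names a, b and c come from the question; the bucket_span value is an assumption chosen only for illustration, and the exclude_frequent settings simply mirror what the question proposes rather than a recommendation.

```python
import json

# Population analysis: each value of b is compared against the other values of b.
population_detector = {
    "function": "high_sum",
    "field_name": "a",
    "over_field_name": "b",
    "exclude_frequent": "over",  # setting proposed in the question
}

# Temporal analysis split by c: each value of c is compared against its own history.
split_detector = {
    "function": "high_sum",
    "field_name": "a",
    "by_field_name": "c",
    "exclude_frequent": "by",  # setting proposed in the question
}

analysis_config = {
    "bucket_span": "15m",  # assumed value, not from the original post
    "detectors": [population_detector, split_detector],
}

print(json.dumps(analysis_config, indent=2))
```

Note that the second detector has no over_field_name at all, which is the situation the question describes.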
Secondly, exclude_frequent predates Filters in Custom Rules, which give you more flexibility and control over what gets excluded.
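As an illustration of the Filters approach, the sketch below shows what a filter and a detector with a custom rule could look like for the high_sum(a) by c case. The filter ID frequent_c_values and the listed items are hypothetical placeholders; you would create the filter first through the ML filters API and then reference it from the rule's scope.

```python
import json

# Hypothetical filter ID and items; replace with the values of c you want excluded.
FILTER_ID = "frequent_c_values"
filter_body = {
    "description": "Values of c that should not produce results",
    "items": ["value_1", "value_2"],
}

# Detector whose custom rule skips results when c matches an item in the filter.
detector_with_rule = {
    "function": "high_sum",
    "field_name": "a",
    "by_field_name": "c",
    "custom_rules": [
        {
            "actions": ["skip_result"],
            "scope": {
                "c": {"filter_id": FILTER_ID, "filter_type": "include"}
            },
        }
    ],
}

print(json.dumps(filter_body, indent=2))
print(json.dumps(detector_with_rule, indent=2))
```

Unlike exclude_frequent, this lets you name exactly which values are excluded instead of leaving that decision to the model.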