If each data point represents a minute, for example, then theoretically the anomaly in the first metric can occur 1 minute after the anomaly in the second metric.
I'm not a statistician, but I think you should use a formula to find the correlation (e.g. Pearson correlation).
Pearson correlation tells you how related (in a linear sense) two variables are on average, and it needs many observations, irrespective of when they occurred. That doesn't make much sense for time-series data, where "correlation" really means that things co-occur in time. Keep in mind that we bucket the data in time (hence the importance of the bucket_span parameter).
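To make the bucketing point concrete, here is a minimal Python sketch of "co-occurrence in time". It is only an illustration, not how the anomaly detection is actually implemented: the helper names `bucket_of` and `co_occurring_buckets` and the timestamps are made up for the example. Two anomalies count as co-occurring when they fall into the same bucket, so the answer depends on the bucket span you pick.

```python
def bucket_of(timestamp_s, bucket_span_s):
    """Index of the time bucket that contains a timestamp (both in seconds)."""
    return int(timestamp_s // bucket_span_s)


def co_occurring_buckets(anomalies_a, anomalies_b, bucket_span_s):
    """Sorted bucket indices in which both series had at least one anomaly."""
    buckets_a = {bucket_of(t, bucket_span_s) for t in anomalies_a}
    buckets_b = {bucket_of(t, bucket_span_s) for t in anomalies_b}
    return sorted(buckets_a & buckets_b)


# Made-up anomaly timestamps (seconds): series B lags series A by one minute.
anomalies_a = [600, 3600]
anomalies_b = [660, 3660]

# With 1-minute buckets the anomalies land in neighbouring buckets.
one_minute = co_occurring_buckets(anomalies_a, anomalies_b, 60)

# With 5-minute buckets the same anomalies share a bucket.
five_minutes = co_occurring_buckets(anomalies_a, anomalies_b, 300)

print(one_minute, five_minutes)
```

With these example timestamps, a one-minute lag is invisible at a 5-minute bucket span but shows up as a miss at a 1-minute span, which is why the choice of bucket span matters for this kind of question.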
In your case, you have 3 metrics where the 3rd is the sum of the first two, so naturally you will get time-correlation (I disagree with your assertion that there is a 1-sample delay in anomalousness). Looking at the anomaly scores of the 3 time series will allow you to infer causality. In other words, if metric3 = metric1 + metric2, then when metric3 is odd it is very likely that metric1 and/or metric2 will also be odd. The scores of metric1 and metric2 are a proxy for which one is most responsible.
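As a rough sketch of that idea in Python: for every bucket where metric3 looks anomalous, pick whichever of metric1 and metric2 has the higher score in the same bucket. Everything here is an assumption for illustration: the function name `most_responsible`, the threshold of 75, and the per-bucket scores are invented, and you would substitute the scores your own job reports.

```python
def most_responsible(scores, total="metric3", parts=("metric1", "metric2"),
                     threshold=75.0):
    """Map each anomalous bucket of `total` to the part with the highest score.

    `scores` maps a metric name to a list of per-bucket anomaly scores.
    `threshold` is an arbitrary cut-off chosen for this example.
    """
    result = {}
    for bucket, total_score in enumerate(scores[total]):
        if total_score >= threshold:
            # The component with the highest score in the same bucket is
            # treated as the most likely contributor.
            result[bucket] = max(parts, key=lambda name: scores[name][bucket])
    return result


# Invented per-bucket anomaly scores, where metric3 = metric1 + metric2.
scores = {
    "metric1": [1.0, 82.0, 3.0, 10.0],
    "metric2": [2.0, 5.0, 1.0, 91.0],
    "metric3": [1.5, 80.0, 2.0, 88.0],
}

blame = most_responsible(scores)
print(blame)
```

Note that the comparison is made within the same bucket, in line with the point above that the sum and its components should be anomalous at the same time rather than one sample apart.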