Just to give you some additional insight into what we are doing. In terms of how we identify anomalous time periods: we create a prediction for the new bucket value, up to some uncertainty, and we take particular care to model this uncertainty accurately as well. Anomalies correspond to values which are unlikely to have been drawn from this predicted distribution. The prediction makes use of a number of features over different timescales, some of which we test for and some of which we include by a type of model averaging.
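To make the "unlikely to be drawn from the predicted distribution" idea concrete, here is a minimal sketch. It assumes a simple Gaussian predictive distribution, which is much cruder than the model described above; the function name, threshold, and Gaussian assumption are all illustrative, not the actual implementation.

```python
from statistics import NormalDist

def anomaly_score(value, predicted_mean, predicted_std):
    """Two-sided tail probability of `value` under a hypothetical Gaussian
    predictive distribution. Small scores mean the value is unlikely to
    have been drawn from the prediction, i.e. anomalous."""
    z = abs(value - predicted_mean) / predicted_std
    return 2.0 * (1.0 - NormalDist().cdf(z))

# Illustrative usage: a value far outside the predicted distribution
# gets a very small tail probability.
score = anomaly_score(5.0, predicted_mean=0.0, predicted_std=1.0)
is_anomalous = score < 0.001  # threshold chosen purely for illustration
```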

In fact, at one point we based the rate at which we learn on the bucket count (whatever the bucket length). In practice this produced poor results at startup for short bucket lengths, because 99% of the signals we saw have features over longer time spans, which are important for making predictions about the next values. One way we now deal with this is to reduce the amount by which we narrow the prior distribution on various model parameters when the bucketing interval is short.
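The idea of narrowing the prior more slowly for short buckets can be sketched as a scale factor on the effective observation count. The reference bucket length and the linear form here are purely illustrative assumptions, not the production formula:

```python
REFERENCE_BUCKET_SECONDS = 1800  # illustrative reference bucket length

def prior_narrowing_factor(bucket_length_seconds):
    """Scale factor (<= 1) applied to the effective count of observations
    used to narrow prior distributions on model parameters. Short buckets
    get a factor below 1, so seeing many buckets quickly does not narrow
    the prior before long-timescale features can be learned."""
    return min(1.0, bucket_length_seconds / REFERENCE_BUCKET_SECONDS)

# A 60s bucket narrows the prior ~30x more slowly per bucket than a
# 30-minute bucket in this sketch.
short_factor = prior_narrowing_factor(60)
long_factor = prior_narrowing_factor(3600)
```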

In terms of sources of delay, we delay the period for model selection at startup to deal with the standard issues one hits with Bayes factors (BFs) for non-informative priors. Also, for aggregate metrics whose distribution depends on the sample count, i.e. things like mean, min and max, we arrange to sample the data in (as close as possible) fixed measurement counts per sample. We therefore take some time to work out the typical rate of records per bucket interval, so that our chosen sample count is close to the mean rate of messages per bucket. This process can be delayed if we see significant variation in the data rates. We roll some of this up into a blanket 2 hour minimum for individual analysis, independent of bucket length; it can be longer. For population analysis, when the detector configuration includes an "over" field, if we see many individuals per bucket interval we are getting information at a faster rate and can therefore generate results sooner.
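The sample-count selection step can be sketched as follows: estimate the mean record rate per bucket, but hold off committing while the rate is still highly variable, which models the extra startup delay mentioned above. The function, the use of the coefficient of variation, and the threshold are all illustrative assumptions:

```python
from statistics import mean, stdev

def choose_sample_count(records_per_bucket, cv_threshold=0.5):
    """Pick a fixed measurement count per sample close to the mean number
    of records per bucket. Returns None while the observed rate is still
    too variable (coefficient of variation above cv_threshold), i.e. the
    decision is deferred and startup is delayed."""
    if len(records_per_bucket) < 2:
        return None  # not enough history to estimate the rate
    m = mean(records_per_bucket)
    if m == 0 or stdev(records_per_bucket) / m > cv_threshold:
        return None  # rate too variable; keep observing
    return max(1, round(m))

stable = choose_sample_count([10, 11, 9, 10])   # steady rate: commit
unstable = choose_sample_count([1, 100, 2, 50])  # variable rate: defer
```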

Obviously, this is the minimum time to generate anomalies. We can't learn long-timescale effects, like weekly periodicity, a slow trend, etc., this quickly. In this context we try to be conservative, i.e. we arrange that failing to capture some important effect early in the model's lifetime produces a blind spot rather than false positives.

Finally, we do have feature requests around modelling data with very short time scale features, particularly things with sub-second duration, so this is an area we are considering enhancing.