You chose a bucket_span of 5s, which I can see in your screenshot. That is why I mentioned it. I'm not saying that this is the correct value to choose, only that I noticed it is what you set.
The bucket span is the window of time over which your data is aggregated. So, if you say sum(V2A7) with a bucket_span of 5s, then all observed values of field V2A7 are summed in little 5-second windows over time, and that summed value is modeled with ML.
So, for example, let's imagine a simplified data set:
time, V2A7
00:00:01, 5
00:00:02, 5
00:00:04, 5
00:00:06, 4
00:00:07, 3
00:00:08, 3
00:00:09, 2
00:00:11, 5
00:00:12, 5
00:00:14, 5
...
With a 5s bucket_span and a sum() aggregation, the above data is summed up into 5s intervals:
00:00:00, 15
00:00:05, 12
00:00:10, 15
...
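To make the arithmetic concrete, here is a minimal Python sketch of the bucketing step, using the same sample rows as above. The function name bucket_sums and the way timestamps are parsed are my own illustration, not anything from the ML product itself; it only mimics what a sum() over a 5s bucket_span does to the raw values.

```python
from collections import defaultdict

# Sample rows from the example above: (time, V2A7)
samples = [
    ("00:00:01", 5), ("00:00:02", 5), ("00:00:04", 5),
    ("00:00:06", 4), ("00:00:07", 3), ("00:00:08", 3), ("00:00:09", 2),
    ("00:00:11", 5), ("00:00:12", 5), ("00:00:14", 5),
]


def bucket_sums(rows, span_seconds):
    """Sum values into fixed windows of span_seconds (hypothetical helper)."""
    sums = defaultdict(int)
    for stamp, value in rows:
        hours, minutes, seconds = (int(part) for part in stamp.split(":"))
        total_seconds = hours * 3600 + minutes * 60 + seconds
        # Round the timestamp down to the start of its bucket.
        bucket_start = total_seconds - (total_seconds % span_seconds)
        sums[bucket_start] += value
    # Format each bucket start back to HH:MM:SS, in time order.
    return {
        f"{s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d}": total
        for s, total in sorted(sums.items())
    }


print(bucket_sums(samples, 5))
# {'00:00:00': 15, '00:00:05': 12, '00:00:10': 15}
```

Changing the second argument (say, to 10) shows how the same raw data produces different summed values, which is why the choice of bucket_span matters.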
ML then learns these values over time (let's say, for example, that sums of roughly 12 to 15 are usual and repeat over time). Then, at some later time, the following values occur:
09:11:00, 15
09:11:05, 240
09:11:10, 15
...
The value of 240 will be seen as unusual since it is very different from the typical sum() values (which are around 12 to 15).
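As a rough illustration of why 240 stands out, the sketch below compares each new bucket value against the earlier ones using a plain z-score. This is an assumption made purely for illustration: it is not how the ML feature actually models data (that is far more sophisticated), and the name flag_unusual and the threshold of 3 are hypothetical.

```python
from statistics import mean, stdev

# Bucketed sums seen earlier (the "learned" typical values).
baseline = [15, 12, 15]

# Bucketed sums seen later.
observed = [("09:11:00", 15), ("09:11:05", 240), ("09:11:10", 15)]


def flag_unusual(history, new_buckets, threshold=3.0):
    """Return buckets whose value is far from the historical mean.

    A simple z-score stand-in, NOT the real ML model.
    """
    centre = mean(history)
    spread = stdev(history)
    return [
        (stamp, value)
        for stamp, value in new_buckets
        if abs(value - centre) / spread > threshold
    ]


print(flag_unusual(baseline, observed))
# [('09:11:05', 240)]
```

The two buckets with a sum of 15 sit well inside the usual range, so only the 240 bucket is flagged.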
You should, however, choose your bucket_span using the tips from the blog. In 99% of cases with machine data, the value of bucket_span will likely be measured in minutes, not seconds.