I'm trying to use Elastic anomaly detection to identify when a particular IP address produces a surge in website visitor activity.
The data set is really simple. It just shows the number of times each IP address visited the site in each hour:
_time,src,visits
2024-09-27T01:00:00.000-0400,199.58.35.150,15
2024-09-27T02:00:00.000-0400,199.58.35.150,25
2024-09-27T03:00:00.000-0400,199.58.35.150,20
So for the week of Sunday, September 22 through Saturday, September 28, the activity above should be anomalous, since the IP ending in .150 visited the site on September 27 but had no other visits earlier in the week.
The problem is that the data has too many missing documents to use high_mean(visits) in the anomaly detection job: an hour in which an IP made no visits has no document at all, rather than a document with visits=0. The only fix I can see is to insert billions of extra documents with visits=0 for every hour and every possible IP address.
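To make the setup concrete, here is a minimal sketch of the kind of job body described above, plus a rough count of how many zero-filled documents it would take. The 1h bucket span, the use of src as the partition field, and the helper name zero_fill_docs are assumptions for illustration, not the exact configuration in use.

```python
import json

# Hypothetical anomaly detection job body matching the description above.
# Assumptions: hourly buckets, one model per source IP (partition on src).
job_body = {
    "analysis_config": {
        "bucket_span": "1h",
        "detectors": [
            {
                "function": "high_mean",
                "field_name": "visits",
                "partition_field_name": "src",
            }
        ],
    },
    "data_description": {"time_field": "_time"},
}


def zero_fill_docs(num_ips: int, days: int) -> int:
    """Documents needed for one row per IP per hour (illustrative helper)."""
    return num_ips * 24 * days


if __name__ == "__main__":
    print(json.dumps(job_body, indent=2))
    # Worst case: every possible IPv4 address, one week of hourly rows.
    print(zero_fill_docs(2**32, 7))
```

Even for a single week, covering every possible IPv4 address comes to hundreds of billions of rows, which is why zero-filling is not practical here.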
Is there another way to make this ML anomaly detection use case work, without using a gap_policy or writing custom code to fill the gaps with zeroes?