Anomaly detection for web site visitor surge (sparse data)

I'm trying to use Elastic anomaly detection to identify when a surge in website visitor activity occurs for a particular IP address.

The data set is really simple: it just shows the number of times an IP address has visited the site each hour.
_time,src,visits
2024-09-27T01:00:00.000-0400,199.58.35.150,15
2024-09-27T02:00:00.000-0400,199.58.35.150,25
2024-09-27T03:00:00.000-0400,199.58.35.150,20

So for the week of Sunday, September 22 through Saturday, September 28, the activity above should be anomalous, since the IP ending in .150 visited the site on September 27 but had no other visits earlier in the week.

The problem is that the data has too many missing documents to use a `high_mean(visits)` detector in the anomaly detection job, unless I insert billions of extra documents to set visits=0 for every hour for every possible IP address.

Is there another way to make this ML anomaly detection use case work, without using a gap_policy or writing custom code to fill the gaps with zeroes?

Analyzing data from high-cardinality entities (like IP addresses) is tricky because:

  1. There are possibly 1M+ entities, and analyzing every entity over all time does not scale easily.
  2. Data may be sparse for many entities.
  3. If a particular entity shows up for the first time and immediately does something anomalous, you cannot flag that behavior as anomalous for that entity, because you have no prior history for that entity to judge against.

Therefore, look to using population analysis instead. Rather than modeling each IP address against its own history, population analysis builds a collective model of typical behavior across all entities in each bucket and flags entities that deviate from the rest of the population. Entities that are absent in a given bucket simply don't contribute to that bucket, so sparse data doesn't require backfilling zeroes, and a brand-new entity can be scored the moment it first appears.
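As a rough sketch, a population job for your sample data could use a `high_sum(visits)` detector with `over_field_name` set to the IP field, which is what makes it a population analysis. The job ID is made up, and the field names (`src`, `visits`, `_time`) are taken from your sample; adjust them, the `bucket_span`, and the `data_description` to match your actual index mapping and timestamp format:

```json
PUT _ml/anomaly_detectors/visitor-surge-population
{
  "analysis_config": {
    "bucket_span": "1h",
    "detectors": [
      {
        "detector_description": "high_sum(visits) over src",
        "function": "high_sum",
        "field_name": "visits",
        "over_field_name": "src"
      }
    ],
    "influencers": [ "src" ]
  },
  "data_description": {
    "time_field": "_time"
  }
}
```

With `over_field_name: "src"`, each hourly bucket compares every IP's visit total against the distribution of all IPs active in that bucket, so the .150 address would stand out on September 27 without needing a week of zero-filled history.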