Latest Timestamp Machine learning doesn't change

See this thread for explanations of query_delay, frequency, and bucket_span: "Are these values right for Query delay, Frequency and Bucket Span?" (post #4 by richcollier)

In short, query_delay is what lags the entire job behind "real-time". If you only ingest data once per day, then you will indeed need to lag your job with a query_delay of at least 1 day. The bucket_span of your job also matters.
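To make the lag concrete, here is a rough sketch of the arithmetic. The function names are made up for illustration, and the second one assumes buckets are aligned to multiples of bucket_span since the Unix epoch, so treat it as an approximation rather than a description of what the datafeed does internally.

```python
from datetime import datetime, timedelta, timezone


def latest_searchable_time(now: datetime, query_delay: timedelta) -> datetime:
    """Newest timestamp a real-time datafeed would query: now minus query_delay."""
    return now - query_delay


def last_complete_bucket_end(
    now: datetime, query_delay: timedelta, bucket_span: timedelta
) -> datetime:
    """End of the last bucket lying entirely before the delayed cutoff.

    Assumption: buckets are aligned to multiples of bucket_span since the epoch.
    """
    cutoff = latest_searchable_time(now, query_delay)
    span_s = int(bucket_span.total_seconds())
    floored = int(cutoff.timestamp()) // span_s * span_s
    return datetime.fromtimestamp(floored, tz=timezone.utc)


# Hypothetical values: data ingested once per day, 15 minute buckets.
now = datetime(2024, 1, 2, 12, 7, tzinfo=timezone.utc)
print(latest_searchable_time(now, timedelta(days=1)))
print(last_complete_bucket_end(now, timedelta(days=1), timedelta(minutes=15)))
```

With a query_delay of 1 day, nothing newer than 24 hours ago is searched, which is why the latest timestamp in the results trails the wall clock.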

Keep in mind that anomaly detection jobs can either run in real-time (with a delay, of course) or be invoked periodically (for example, with a script that hits the datafeed API with a start and end time) to process previously ingested documents.
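For the periodic approach, a minimal sketch of such a script might look like the following. It only builds the request for the start datafeed endpoint (`POST _ml/datafeeds/<feed_id>/_start`); the host, datafeed ID, and time range are placeholders, and it assumes the job has already been opened and that you add whatever authentication your cluster needs.

```python
import json
from urllib import request


def build_start_datafeed_request(
    base_url: str, datafeed_id: str, start: str, end: str
) -> request.Request:
    """Build a POST to the start datafeed API for a bounded time range."""
    url = f"{base_url}/_ml/datafeeds/{datafeed_id}/_start"
    body = json.dumps({"start": start, "end": end}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Placeholder values: process yesterday's documents after the daily ingest.
req = build_start_datafeed_request(
    "http://localhost:9200",
    "datafeed-my-job",
    "2024-01-01T00:00:00Z",
    "2024-01-02T00:00:00Z",
)
print(req.get_method(), req.full_url)

# To actually send it (the job must be open; add auth headers as needed):
# with request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Because an end time is given, the datafeed should stop once it reaches it, so a cron job could run this after each daily ingest completes.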

What you DON'T want is for the anomaly detection job to search the ES index for a certain time range and find no documents because they have not been ingested yet.
