For explanations of query_delay, frequency, and bucket_span, see "Are these values right for Query delay, Frequency and Bucket Span?" (post #4 by richcollier).
In short, query_delay is what lags the entire job behind "real-time". If you only ingest data once per day, then you will indeed need to lag your job with a query_delay of at least 1 day. The bucket_span of your job matters as well.
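To make the effect of query_delay concrete, here is a rough sketch in Python. It assumes the simplified model that a real-time datafeed only searches for data older than "now minus query_delay". The job and index names are hypothetical, and the helper function is mine, not part of any Elastic client.

```python
from datetime import datetime, timedelta, timezone


def latest_queryable_time(now: datetime, query_delay: timedelta) -> datetime:
    """Newest timestamp the datafeed would look at, under the simplified
    assumption that it only searches data older than now - query_delay."""
    return now - query_delay


# Hypothetical datafeed settings for data that is ingested once per day.
# "my-job" and "my-index" are placeholder names, not from the original post.
datafeed_config = {
    "job_id": "my-job",
    "indices": ["my-index"],
    "query_delay": "1d",  # at least as long as the ingest lag
}

now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
cutoff = latest_queryable_time(now, timedelta(days=1))
print(cutoff.isoformat())
```

With a 1 day delay, the job running at noon on Jan 2 only looks at data up to noon on Jan 1, which gives the daily ingest time to land first.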
Keep in mind that the anomaly detection jobs can either be running in real-time (with a delay, of course) or they could be invoked periodically (with a script that hits the datafeed API with a start and end time, for example) to process previously ingested documents.
What you DON'T want is for the anomaly detection job to search the Elasticsearch index for a certain time range and find no documents there because they have not been ingested yet.
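A periodic run of the kind described above could be sketched like this. The script only builds the request for the start datafeed endpoint (POST _ml/datafeeds/<datafeed_id>/_start with start and end) and caps the end time at the newest ingested timestamp, so the job never asks for a range that has no documents yet. The datafeed ID, the function name, and the way you obtain the newest ingested timestamp are all assumptions for illustration. Actually sending the request (HTTP client, authentication) is left out.

```python
from datetime import datetime, timezone


def build_start_request(datafeed_id: str, start: datetime, end: datetime,
                        latest_ingested: datetime):
    """Return (path, body) for a start datafeed call, with end capped at
    the newest timestamp that is already in the index."""
    # Never ask for data newer than what has been ingested so far.
    safe_end = min(end, latest_ingested)
    if safe_end <= start:
        raise ValueError("nothing ingested for this time range yet")
    path = f"/_ml/datafeeds/{datafeed_id}/_start"
    body = {"start": start.isoformat(), "end": safe_end.isoformat()}
    return path, body


# Hypothetical example: we want Jan 1 through Jan 3, but documents have
# only been ingested up to the start of Jan 2.
path, body = build_start_request(
    "datafeed-my-job",  # placeholder datafeed ID
    start=datetime(2024, 1, 1, tzinfo=timezone.utc),
    end=datetime(2024, 1, 3, tzinfo=timezone.utc),
    latest_ingested=datetime(2024, 1, 2, tzinfo=timezone.utc),
)
print(path, body)
```

Raising an error when the capped range is empty is a design choice here: it is usually better for the script to skip a run than to have the job process an empty time range.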