ML Anomaly Detection count of Processed Records

marmai16 · November 29, 2023, 1:27pm

Hello everyone,

I deployed several anomaly detection jobs with different query_delay values to evaluate which one works best without losing documents or lagging too far behind real time.

What's interesting is that the number of Processed Records reported by an anomaly detection job differs from the actual number of documents in the index: Processed Records is higher (e.g. the index contains 250 docs, but the job reports 265 Processed Records).

So my question is: what does Processed Records actually mean or imply?
I thought the datafeed queries the index only once for new docs in any given time period, so there should be no situation where a doc is counted multiple times.

Thank you in advance!

Edit: Some additional information.
The datafeed start time is the same for every job. The Processed Records counts of the jobs with different query_delay values begin to diverge after the jobs have been running in real time for a while.
However, they all show the correct value for Latest timestamp, matching the most recent document in the source index.
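
For anyone who wants to reproduce the comparison, here is a minimal sketch of how I compute the gap. It assumes two JSON responses have already been fetched: the job stats from `GET _ml/anomaly_detectors/<job_id>/_stats` and the document count from `GET <index>/_count` (ideally restricted to the same time range the job has analysed). The function name `count_gap`, the job id `example-job`, and the hard-coded sample dicts are only for illustration; the numbers are the ones from my example above.

```python
def count_gap(job_stats: dict, index_count: dict) -> int:
    """Return the job's processed_record_count minus the index doc count."""
    # The job stats response nests the counter under jobs[].data_counts
    processed = job_stats["jobs"][0]["data_counts"]["processed_record_count"]
    # The count response exposes the number of matching docs as "count"
    docs = index_count["count"]
    return processed - docs


# Illustrative responses, trimmed to the fields used here (values from the post)
job_stats = {
    "jobs": [
        {
            "job_id": "example-job",  # hypothetical job id
            "data_counts": {"processed_record_count": 265},
        }
    ]
}
index_count = {"count": 250}

gap = count_gap(job_stats, index_count)
print(f"Processed Records exceed index docs by {gap}")
```

A positive result means the job reports more processed records than there are documents in the index, which is the situation I am seeing.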
