Latest Timestamp in Machine Learning jobs doesn't change

Elasticsearch: Platinum 7.10.1
Kibana: 7.10.1

Hi,

I am creating Machine Learning jobs that run in real time. My questions are:
What kind of information does the Latest Timestamp provide?
and
Why does the Latest Timestamp remain the same as it was on the first day I created the job, even though the latest timestamp has changed inside the graph?

Thank you in advance!

/Angelos

Hi @Angelos,

The value presented in the Latest timestamp column is the timestamp of the latest document that was processed by the job.
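If you want to double-check this outside of the UI, a rough sketch (the job id my_job here is just a placeholder) is to look at the job's data counts via the job stats API:

GET _ml/anomaly_detectors/my_job/_stats

The data_counts.latest_record_timestamp field in the response should correspond to what the job list shows in the Latest timestamp column.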

Regarding the second part of your question, could you please clarify which graph you are referring to?

Hi David,

I meant the Single Metric Viewer. It seems like the Latest Timestamp there is changing but not in the overview.

/Angelos

When you say that the latest timestamp is changing in the Single Metric Viewer, are you referring to the time picker in the top-right or are you seeing results in the chart with a newer timestamp than what you see in the job list?

A screenshot would be helpful here. Thanks!

Sorry for the late reply, I was working on this.

No, I mean the latest data that we have in the Single Metric Viewer:

Here it says March 12th 2021, but in the overview the Latest Timestamp had a very old date.

I increased the query_delay and it seems to be better; however, now I get (random) warnings saying that "Datafeed has missed a number of documents due to ingest latency".

We ingest data every day at a random time, and I set the delay to 1d, but I still got the warning.
I don't really understand how it works.
How much bigger should the query_delay be in order to avoid those warnings?

/Angelos

See this for explanations of query_delay, frequency, and bucket_span: Are these values right for Query delay, Frequency and Bucket Span? - #4 by richcollier

In short, query_delay is what lags the entire job behind "real-time". If you only ingest data once per day then indeed, you will need to lag your job with a query_delay of at least 1 day. It also matters what the bucket_span of your job is.
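As a rough sketch (the datafeed id and value are made up for illustration), increasing the delay on an existing datafeed would look something like this (the datafeed has to be stopped before it can be updated):

POST _ml/datafeeds/datafeed-my_job/_update
{
  "query_delay": "26h"
}

Setting it a bit higher than the worst-case ingest lateness (for example 26h if the daily ingest can land up to a couple of hours late) is what keeps the "missed documents due to ingest latency" warnings away.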

Keep in mind that anomaly detection jobs can either run in real time (with a delay, of course) or be invoked periodically (with a script that hits the datafeed API with a start and end time, for example) to process previously ingested documents.
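For example (the datafeed id and dates are hypothetical), a daily script could start the datafeed over just the previous day's range:

POST _ml/datafeeds/datafeed-my_job/_start
{
  "start": "2021-03-11T00:00:00Z",
  "end": "2021-03-12T00:00:00Z"
}

The job then processes exactly that window and stops, so query_delay becomes effectively irrelevant as long as the script runs after the day's data has actually been ingested.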

What you DON'T want is for the Anomaly Detection job to search the ES index for a certain time range but find no documents, because they have not been ingested yet.

