ML job not updating in real time

chrillebile · August 2, 2018, 7:57am

Hey, I'm having a strange problem.

I have an ML job (with a bucket span of 1 day) that is running in real time, but its "latest timestamp" is 2018-07-26 even though the latest data is from 2018-08-01. If I stop it and run it manually from 1 week before the latest timestamp until today, it still does not update. If I create a new, identical job, it gets a latest timestamp of 2018-07-31, which is still not the latest. But if I clone the original job (the one stuck at 2018-07-26), the clone gets the correct latest timestamp of 2018-08-01.

dmitri · August 3, 2018, 10:12am

Hi,

Could you paste the job configuration as well as the datafeed configuration please?
You can use the get-job API and get-datafeed API to do that.
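For reference, a minimal Python sketch of those two calls. It assumes a 6.x cluster (such as the 6.3.1 mentioned later in this thread), where the ML endpoints live under `_xpack/ml`, and an unsecured node at `http://localhost:9200`. The IDs `my-job` and `datafeed-my-job` are hypothetical placeholders.

```python
from urllib.request import Request

# Assumption: local, unsecured Elasticsearch node. Adjust for your cluster.
ES_URL = "http://localhost:9200"


def get_job_request(job_id, base_url=ES_URL):
    """Build the GET request for an anomaly detection job's configuration."""
    return Request(f"{base_url}/_xpack/ml/anomaly_detectors/{job_id}", method="GET")


def get_datafeed_request(datafeed_id, base_url=ES_URL):
    """Build the GET request for a datafeed's configuration."""
    return Request(f"{base_url}/_xpack/ml/datafeeds/{datafeed_id}", method="GET")


# Hypothetical IDs, replace with your own job and datafeed names.
job_req = get_job_request("my-job")
feed_req = get_datafeed_request("datafeed-my-job")
print(job_req.get_method(), job_req.full_url)
print(feed_req.get_method(), feed_req.full_url)
```

To actually send a request, pass it to `urllib.request.urlopen` and decode the JSON response. The same paths can be pasted into the Kibana Dev Tools console.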

Please also mention which version you are running on.

Finally, could you explain a bit more on how the data is ingested? Are there multiple documents per day or just one? Are they continuously indexed or in batches? If in batches, when are those batches indexed in a day?

chrillebile · August 3, 2018, 11:58am

Hey.

I'm running version 6.3.1.
On average, 60 (±40) documents are added per day, mostly during the night but also during the day, Monday to Friday. They are not added in batches; sometimes hours pass between documents and sometimes just a few minutes.
Documents are assigned a shared timestamp, depending on which version they are running. For example, say a few documents are inserted at 2018-08-03-01:53:06, 2018-08-03-04:27:42 and 2018-08-03-03:12:31: the first two get the timestamp August 2nd 2018, 22:01:00.000 and the last one gets the timestamp August 2nd 2018, 22:06:00.000.
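To make the gap between insert time and document timestamp concrete, here is a small Python sketch that computes it for the three example documents above. It assumes the insert times and the timestamps are expressed in the same time zone, which the post does not state.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# (inserted_at, document_timestamp) pairs taken from the examples above.
# Assumption: both values are in the same time zone.
samples = [
    ("2018-08-03 01:53:06", "2018-08-02 22:01:00"),
    ("2018-08-03 04:27:42", "2018-08-02 22:01:00"),
    ("2018-08-03 03:12:31", "2018-08-02 22:06:00"),
]


def ingest_lag_seconds(inserted_at, doc_timestamp):
    """Seconds between a document's timestamp and the moment it was inserted."""
    delta = datetime.strptime(inserted_at, FMT) - datetime.strptime(doc_timestamp, FMT)
    return int(delta.total_seconds())


lags = [ingest_lag_seconds(inserted, ts) for inserted, ts in samples]
max_lag = max(lags)
print(lags)     # lag per document, in seconds
print(max_lag)  # 23202 seconds, i.e. 6 h 26 min 42 s
```

Under that assumption, these documents become searchable between roughly 4 and 6.5 hours after the timestamp they carry.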

dmitri · August 3, 2018, 1:01pm

OK, I am pretty sure that what's happening here is that the datafeed is advancing through those times and finding no data. The datafeed runs in real time, but the way the data is indexed is not exactly real-time, so the datafeed searches a time range, sees no data and moves on. You will probably need to adjust the datafeed's frequency and query_delay parameters to work with the date manipulations you are doing. You can read more in the Datafeed documentation.
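As an illustration of that adjustment, here is a hedged Python sketch that builds an update-datafeed request. It assumes the 6.x endpoint `POST _xpack/ml/datafeeds/<datafeed_id>/_update`, an unsecured node at `http://localhost:9200`, and a hypothetical datafeed ID. The values `7h` and `1h` are illustrative placeholders, not recommendations. The usual aim is a query_delay longer than the largest gap between a document's timestamp and the moment it is indexed, which is over 6 hours in the examples given earlier in the thread.

```python
import json
from urllib.request import Request

# Assumption: local, unsecured Elasticsearch node. Adjust for your cluster.
ES_URL = "http://localhost:9200"


def update_datafeed_request(datafeed_id, query_delay, frequency, base_url=ES_URL):
    """Build a POST request that changes a datafeed's query_delay and frequency."""
    body = {"query_delay": query_delay, "frequency": frequency}
    return Request(
        f"{base_url}/_xpack/ml/datafeeds/{datafeed_id}/_update",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical datafeed ID and illustrative values.
req = update_datafeed_request("datafeed-my-job", query_delay="7h", frequency="1h")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

The request is only built here, not sent; pass it to `urllib.request.urlopen` to apply it. It is safest to stop the datafeed before updating it and restart it afterwards, so the new settings are sure to take effect.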
