SLM policy: the snapshot start time is inconsistent with the scheduled time

I use an SLM policy to create snapshots on the schedule 0 0 * * * ?. The next run is scheduled for 12:00, but the actual start time of the snapshot is 11:59:59, one second ahead of schedule, or sometimes 12:00:01, one second late. The cluster has three replicas of the ES master node and three data nodes.

Hello @lijianzhi, welcome to the community!
The snapshot policy, or more precisely the snapshot task, is scheduled on the ES master node and should start at the scheduled time. The variation here is probably due to the system time of the host where ES is running drifting relative to UTC.

Thank you for your answer. However, my servers are time-synchronized, and I also deployed a single node to verify this, which showed the same problem. The start timestamp of the snapshot is still inconsistent with the scheduled timestamp of the policy.

I am not surprised that it sometimes shows a later timestamp, and that's not something we'd consider a bug. But I do not think it should ever run early. From looking at the code I don't think it does actually run earlier than scheduled, although I do see a way for it to pick up a timestamp that's slightly earlier than the actual start time which would cause what you're seeing.

Thank you, but I have verified it many times. I created a snapshot policy that is scheduled every 15 minutes. Most of the snapshots start 1 second early, and the others start exactly on time.

Why is this a problem?

Yes. To clarify my previous message, I can see a way for the timestamp to be slightly earlier than the actual start time of the snapshot. You can report this as a bug on GitHub if you'd like.

Thank you very much! Can you tell me which method causes this problem?

The issue is that org.elasticsearch.xpack.core.scheduler.SchedulerEngine#clock is the (accurate) java.time.Clock#systemUTC, whereas the snapshot timestamp comes from org.elasticsearch.threadpool.ThreadPool#absoluteTimeInMillis, which is updated by org.elasticsearch.threadpool.ThreadPool.CachedTimeThread and can therefore, by default, lag by 200ms or more.
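
To illustrate the mechanism, here is a minimal, self-contained sketch. It is not the Elasticsearch code: CachedTime and ManualClock are hypothetical stand-ins for a periodically refreshed cached timestamp and a controllable clock, and the 150 ms gap is an arbitrary value chosen for the example. It only shows how a job that fires exactly on schedule can still record a start time that falls in the previous second.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.concurrent.atomic.AtomicLong;

public class CachedClockSketch {

    /** Hypothetical cached time source: the value changes only when refresh() is called. */
    static final class CachedTime {
        private final Clock source;
        private final AtomicLong cachedMillis;

        CachedTime(Clock source) {
            this.source = source;
            this.cachedMillis = new AtomicLong(source.millis());
        }

        /** In a real system a background thread would call this at a fixed interval. */
        void refresh() {
            cachedMillis.set(source.millis());
        }

        long absoluteTimeInMillis() {
            return cachedMillis.get();
        }
    }

    /** Hypothetical clock that advances only when told to, so the example is deterministic. */
    static final class ManualClock extends Clock {
        private long millis;

        ManualClock(long startMillis) {
            this.millis = startMillis;
        }

        void advance(long deltaMillis) {
            millis += deltaMillis;
        }

        @Override public ZoneId getZone() { return ZoneOffset.UTC; }
        @Override public Clock withZone(ZoneId zone) { return this; }
        @Override public Instant instant() { return Instant.ofEpochMilli(millis); }
        @Override public long millis() { return millis; }
    }

    /** Returns { time the scheduler fired, start time read from the cached source }. */
    static long[] simulate() {
        long scheduled = Instant.parse("2024-01-01T12:00:00Z").toEpochMilli();

        // The cache was last refreshed 150 ms before the scheduled time (assumed gap).
        ManualClock clock = new ManualClock(scheduled - 150);
        CachedTime cached = new CachedTime(clock);

        // The accurate clock reaches the scheduled time; the cache has not been refreshed yet.
        clock.advance(150);
        long schedulerFiredAt = clock.millis();
        long recordedStart = cached.absoluteTimeInMillis();

        return new long[] { schedulerFiredAt, recordedStart };
    }

    public static void main(String[] args) {
        long[] result = simulate();
        System.out.println("scheduler fired at: " + Instant.ofEpochMilli(result[0]));
        System.out.println("recorded start:     " + Instant.ofEpochMilli(result[1]));
    }
}
```

In this sketch the job fires at exactly 12:00:00, but the recorded start is 150 ms earlier, which reads as 11:59:59 when the timestamp is shown at one-second resolution. The job itself did not run early; only the timestamp it picked up was stale.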
