Scheduled Watches not Triggering

alerting

(C Campbell) #1

I have 3 scheduled watches running on my 3-node ES cluster. Since I took the nodes down for patching at the start of December, the watches no longer fire at 6 AM as scheduled. All three watches have shown a status of "Firing" ever since the servers were restarted. I can use the test function to execute the watches and they perform as expected, but I would like to get them firing automatically again.

What steps can I take to troubleshoot this problem?


(Michael Basnight) #2

Hi, sorry to hear Watcher is not happy! What version of Elasticsearch are you running? Can you get me the watch history for these watches, and also the Watcher status from GET _xpack/watcher/stats? If you have not tried it yet, you may also want to _stop and then _start Watcher.
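
For readers who want to script the checks suggested above, here is a minimal Python sketch of the stats, stop, and start calls. It assumes the 6.x `_xpack/watcher` endpoints named in this thread and an unsecured cluster at `http://localhost:9200`; the `ES_URL` constant and the helper names `watcher_request` and `call` are made up for illustration, and a secured cluster would need authentication added.

```python
import json
import urllib.request

# Assumption: adjust host, port, and authentication for your own cluster.
ES_URL = "http://localhost:9200"


def watcher_request(action):
    """Return (method, path) for a Watcher 6.x endpoint: 'stats', 'stop', or 'start'."""
    routes = {
        "stats": ("GET", "/_xpack/watcher/stats"),
        "stop": ("POST", "/_xpack/watcher/_stop"),
        "start": ("POST", "/_xpack/watcher/_start"),
    }
    return routes[action]


def call(action, base_url=ES_URL):
    """Send the request for the given action and return the decoded JSON response."""
    method, path = watcher_request(action)
    req = urllib.request.Request(base_url + path, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `call("stats")` first, then `call("stop")` and `call("start")`, mirrors the sequence suggested in this post. Comparing the stats output before and after the restart may show whether Watcher was actually running on each node.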


(Alexander Reelsen) #3

Can you also report which version of Elasticsearch you are using?

Can you also share the watcher history, if the watches have been triggered at all?

Thank you!


(C Campbell) #4

I tried stopping and starting Watcher this morning, which may have started executions again. I moved one of my scheduled reports to run at noon today to see if it executes as expected. Thanks.


(C Campbell) #5

I'm using v6.4.0 (for the full stack). I looked at .watcher_history and it looks like nothing had been triggered since my maintenance work on the cluster at the start of December. There were a few attempted executions like this:

watch_id: xJSF3zE-RBCV5j2uI5XEPA_kibana_version_mismatch
node: Jwyj0hcGQeSlTvIUUXViAQ
state: not_executed_already_queued
trigger_event.type: schedule
trigger_event.triggered_time: December 28th 2018, 08:42:51.319
trigger_event.schedule.scheduled_time: December 28th 2018, 08:42:51.124
messages: Watch is already queued in thread pool
_id: xJSF3zE-RBCV5j2uI5XEPA_kibana_version_mismatch_04c8d92e-e7c8-4431-b694-1c8a1db37802-2018-12-28T13:42:51.319Z
_type: doc
_index: .watcher-history-9-2018.12.28
_score: -

But none of the other system watches seem to have executed. Since doing a stop and start on watcher, it looks like the system watches are executing so I'm waiting to see if the problem is resolved.
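
The history check described above can also be run as a search instead of browsing in Kibana. The sketch below only builds the request body. It assumes the field names shown in the record above (`watch_id`, `state`, `trigger_event.triggered_time`) are mapped so that they can be filtered and sorted on, and the helper name `history_query` is hypothetical.

```python
# Assumption: history indices follow the .watcher-history-* naming seen in the record above.
HISTORY_SEARCH_PATH = "/.watcher-history-*/_search"


def history_query(watch_id=None, states=("not_executed_already_queued",), size=20):
    """Build a search body for the most recent history records in the given states."""
    filters = [{"terms": {"state": list(states)}}]
    if watch_id is not None:
        # Narrow the search to a single watch when an ID is supplied.
        filters.append({"term": {"watch_id": watch_id}})
    return {
        "size": size,
        # Newest trigger events first.
        "sort": [{"trigger_event.triggered_time": {"order": "desc"}}],
        "query": {"bool": {"filter": filters}},
    }
```

POSTing the returned dictionary as JSON to `HISTORY_SEARCH_PATH` should list the latest skipped executions. Passing a different `states` tuple would show other outcomes, for example records for watches that did run.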


(C Campbell) #6

FYI -
My test watch executed successfully now, so I'll see if the other watches run in the morning. Thanks for the simple guidance; I wasn't aware of how to stop and start Watcher to get things running again.


(Alexander Reelsen) #7

Hey,

We fixed an important issue regarding watch execution in 6.4.1, which might have been causing the behaviour you describe. I'd urge you to upgrade at least Elasticsearch to 6.4.1 and then check whether the issue still happens.

The PR in question is https://github.com/elastic/elasticsearch/pull/33167

--Alex


(C Campbell) #8

I have a partial success: the watch that I edited yesterday to adjust its execution time (as a test) executed as scheduled this morning, but the other one that was scheduled for this morning did not run. I've made a minor edit to that watch to see if re-saving it will allow it to execute tomorrow morning.


(C Campbell) #9

Unfortunately I won't be able to upgrade this system for a while; we have too many more critical projects looming right now. I will try to get up to 6.5 as soon as I can manage the time.


(C Campbell) #10

This issue is resolved now. After stopping and starting Watcher, I needed to edit and save the watches that showed as "Firing" to get them to begin executing on schedule again.
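
With many watches, the edit-and-save step could in principle be scripted: fetch each watch and PUT its definition back. The sketch below only builds the PUT request from a GET watch response. It assumes the 6.x `_xpack/watcher/watch/<id>` endpoint and a response carrying `found` and `watch` keys. It also assumes that re-saving an unchanged definition through the API has the same effect as editing and saving in the UI, which this thread does not confirm. The helper name `resave_request` is hypothetical.

```python
from urllib.parse import quote


def resave_request(watch_id, get_response):
    """Build (method, path, body) that re-saves a watch from its GET watch response."""
    if not get_response.get("found"):
        raise ValueError("watch not found: " + watch_id)
    # Escape the ID so it is safe to use as a single URL path segment.
    path = "/_xpack/watcher/watch/" + quote(watch_id, safe="")
    # The watch definition is under the "watch" key of the GET response.
    return ("PUT", path, get_response["watch"])
```

Note that GET returns secrets such as passwords in redacted form, so watches that contain credentials may need extra care before being written back.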

