Watcher execution

alerting

(GOPAL) #1

When a cluster has multiple nodes, we need one Watcher instance on each node (please correct me if that's not the case). In that case, how does Watcher distribute rule execution to spread the load?
Also, how does it ensure that the same rule is not executed twice on two different nodes?
I couldn't find this stated clearly in any docs, hence the question.


(Joshua Rich) #2

You'll need to install the Watcher plugin on all nodes within your cluster. Watch scheduling and execution are done on the current master node, so you do incur some CPU and memory overhead there. Because Watcher runs only on the master node, only one Watcher instance is active in the cluster at a time, so rules are guaranteed not to be executed multiple times.


(GOPAL) #3

Thanks, Joshua.
Well, that answered one doubt but gave rise to another. What is the role of the Watcher plugin on the other nodes in that case?
How does Watcher then make sure that, when a node fails, execution happens on another node?

Thanks.


(Martijn Van Groningen) #4

When a watch gets triggered, this is stored in the .triggered_watches index. When the execution of the watch has completed, the execution output is stored in the current history index and the entry in the .triggered_watches index is removed.

If the current elected master steps down (for whatever reason), Watcher stops on that node. When a new master node is elected and Watcher starts on it, the first thing Watcher does is execute the watches still stored in the .triggered_watches index.
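
To make the hand-over concrete, here is a rough toy model of that lifecycle in Python. It is only a sketch of the behaviour described above, not Watcher's actual code: the `Cluster` class, its method names, the watch IDs, and the node names are all made up for illustration, and the two indices are simulated with an in-memory dict and list.

```python
class Cluster:
    """Toy model: in-memory stand-ins for the two indices named above."""

    def __init__(self):
        self.triggered_watches = {}  # stand-in for the .triggered_watches index
        self.history = []            # stand-in for the current history index

    def trigger(self, watch_id):
        # A triggered watch is recorded before it is executed.
        self.triggered_watches[watch_id] = "pending"

    def complete(self, watch_id, node):
        # On completion the output goes to history and the pending entry is removed.
        self.history.append((watch_id, node))
        del self.triggered_watches[watch_id]

    def start_watcher(self, node):
        # A newly elected master first executes whatever entries were left behind.
        for watch_id in list(self.triggered_watches):
            self.complete(watch_id, node)


cluster = Cluster()
cluster.trigger("disk_usage")
cluster.complete("disk_usage", "node-1")  # normal execution on the master
cluster.trigger("error_rate")             # node-1 steps down before executing this
cluster.start_watcher("node-2")           # the new master picks up the leftover

print(cluster.history)            # [('disk_usage', 'node-1'), ('error_rate', 'node-2')]
print(cluster.triggered_watches)  # {}
```

The point of the model is that the pending entry lives in an index rather than in the memory of the master node, so it survives the master stepping down and can be executed by whichever node is elected next.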

