"Cooldown" for watches

Is it possible to set some type of "cooldown" for watches? If I just increase the trigger interval, the reaction time goes up with it. For example, I want to watch a value and check whether it was below a certain threshold during the last 5 minutes, and I want to run the watch every minute to keep the reaction time low. But once the watch has triggered, I don't want to get an alert every minute, but only, say, every hour.

I think you may want to check the throttle period per watch or per action.

See https://www.elastic.co/guide/en/elastic-stack-overview/7.0/actions.html#actions-ack-throttle
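For example, a per-action throttle could look roughly like this (just a rough sketch; "send_email" and the "1h" period are placeholders for your own action and interval, and the elided parts are up to you):

{
	"trigger": {
		"schedule": {
			"interval": "1m"
		}
	},
	"input": {...},
	"condition": {...},
	"actions": {
		"send_email": {
			"throttle_period": "1h",
			"email": {...}
		}
	}
}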

I already thought about that, but the problem is that if the value reaches the threshold a second time within the throttle period, you won't get alerted. Considering this, my best guess would be to query "trigger_event.triggered_time" from the ".watcher-history-*" index in a chained input and make the query range of the watch depend on that time value.
(It would be an approach somewhat similar to: https://discuss.elastic.co/t/watcher-alerting-time-issues-with-frequent-watch/68710/3?u=6tubp3gwd9zp)
Would that even be possible? If so, how?

I must be missing something: throttling does exactly what you described in your first post, by not triggering an action on every watch run, but only once per hour (when the throttle period is set to that interval).

In your second post you want to be alerted when you reach the threshold a second time, which contradicts the first statement to me. Can you clarify, please?

Important note for understanding how throttling works within Watcher: throttling also resets its state once the condition becomes false again. So if the condition turns false within the throttle period and then true on a later run, the throttle period restarts and an alert will be sent.

I agree my first post was a bit misleading.

Hope this image I drew makes it a bit clearer.
The reaction time depends heavily on the trigger interval, so the interval should be small.
The query range should be equal to the throttle time, otherwise you either miss alerts or get alerted for the same event multiple times.
With a big query range the graph could reach the threshold multiple times; for each of those events I want exactly one alert, not multiple alerts for the same event. So my idea was that the "from" timestamp of the query range (the left side of the box in the image) must always be set dynamically to the "last time fired".

OK, I think I solved it. The watch now first fetches the last time it was executed and then uses this time as the lower ("must be greater than") bound of the time range in its actual query. This way I get informed about every spike in my graph, but not about the same spike twice. I also don't become "blind" to further spikes, because throttling is no longer needed. Hope this is also valuable for someone else.

{
	"metadata": {...},
	"trigger": {...},
	"input": {
		"chain": {
			"inputs": [{
					"first": {
						"search": {
							"request": {
								"search_type": "query_then_fetch",
								"indices": [".watcher-history-*"],
								"types": ["doc"],
								"body": {
									"size": 0,
									"aggs": {
										"1": {
											"max": {
												"field": "result.execution_time"
											}
										}
									},
									"query": {
										"bool": {
											"must": [{
													"match_all": {}
												}, {
													"range": {
														"trigger_event.triggered_time": {
															"gte": "{{ctx.trigger.scheduled_time}}||-30d",
															"lte": "{{ctx.trigger.scheduled_time}}",
															"format": "strict_date_optional_time||epoch_millis"
														}
													}
												}, {
													"match_phrase": {
														"metadata.name": {
															"query": "Watcher_name"
														}
													}
												}, {
													"match_phrase": {
														"state": {
															"query": "executed"
														}
													}
												}
											],
											"filter": [],
											"should": [],
											"must_not": []
										}
									}
								}
							}
						}
					}
				}, {
					"second": {
						"search": {
							"request": {
								"search_type": "query_then_fetch",
								"indices": ["index*"],
								"types": ["doc"],
								"body": {
									"size": 0,
									"aggs": {
										"1": {...},
										}
									},
									"query": {
										"bool": {
											"must": [{
													"match_all": {}
												}, {
													"range": {
														"@timestamp": {
															"gte": "{{ctx.payload.first.aggregations.1.value_as_string}}",
															"lte": "{{ctx.trigger.scheduled_time}}",
															"format": "strict_date_optional_time||epoch_millis"
														}
													}
												}
											],
											"filter": [],
											"should": [],
											"must_not": []
										}
									}
								}
							}
						}
					}
				}

			]
		}
	},
	"condition": {...},
	"actions": {...}
}
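As a side note for anyone adapting this: the condition above is elided, and the original one isn't shown here. A purely hypothetical compare condition, assuming the "1" aggregation of the second input returns a numeric value and 100 is your threshold, might look like this:

{
	"condition": {
		"compare": {
			"ctx.payload.second.aggregations.1.value": {
				"gte": 100
			}
		}
	}
}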
