ACK Watch API

alerting

(Aviral Srivastava) #1

I have created a watch on packetbeat data which sends an email when the number of HTTP 404 responses (http.code) reaches 10 or more within the last minute.
{
  "trigger" : { "schedule" : { "interval" : "10s" } },
  "input" : {
    "search" : {
      "request" : {
        "indices" : [ "packetbeat-*" ],
        "body" : {
          "query" : {
            "bool" : {
              "must" : [
                {
                  "match" : { "beat.hostname" : "pp-pp" }
                },
                {
                  "match" : { "http.code" : 404 }
                }
              ],
              "filter" : {
                "range" : {
                  "@timestamp" : {
                    "gte" : "now-1m",
                    "lte" : "now"
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition" : {
    "compare" : { "ctx.payload.hits.total" : { "gte" : 10 }}
  },
  "actions" : {
    "email_admin" : {
      "email": {
        "to": ["aviral.srivastava@company.com"],
        "subject": "AppMon2.0: Application Test down!!!",
        "body": "Dear user, It is found that the application Test have more than or equal to 10 http status error codes. Take a look at it."
      }
    }
  }
}
I acknowledge the watch like this:
http://localhost:9200/_watcher/watch/packetbeat_watcher/_ack

According to my understanding of the ACK Watch API, it prevents the action (sending the email) from executing more than once for as long as the condition remains satisfied.

My problem is that when I acknowledge the watch right after creating it, I still get repeated emails.

When I acknowledge the watch after the condition has turned true, I get only one email. But even in this scenario, when the condition turns false and then true again, I get multiple emails again.

So, when should we acknowledge the watch, and do we need to acknowledge it again every time the condition turns true after having been false?


(Alexander Reelsen) #2

Hey,

I am not sure I am following you here. Let me explain how throttling works in Watcher, and if there are still discrepancies in your setup, let's try to reproduce this more exactly.

First, you should fix your watch input. Instead of just searching for 404 errors, you also have to supply a timestamp range; otherwise you will always get a positive result as soon as 10 errors have been logged since you started logging.

Second, there is the option of throttling in Watcher. By default, the throttle period is set to 5s. So if you do not set your own throttle period and store this watch, you will get an email every 10 seconds (the watch runs every 10 seconds, which is longer than the throttle period). This is time-based throttling.
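To make the interaction between the 10s trigger interval and the throttle period concrete, here is a minimal Python sketch. It is a simplified model and not Watcher's actual implementation: the function name `executed_runs` is hypothetical, and it assumes the condition is met on every check and that an action may run again once at least `throttle_s` seconds have passed since its last run.

```python
def executed_runs(interval_s, throttle_s, duration_s):
    """Return the check times (seconds) at which the action would run.

    Simplified model: the condition is assumed to be true on every check,
    and the action is skipped while fewer than throttle_s seconds have
    passed since it last ran.
    """
    executed = []
    last_run = None
    for t in range(0, duration_s, interval_s):
        if last_run is None or t - last_run >= throttle_s:
            executed.append(t)
            last_run = t
    return executed


# 10s interval with the default 5s throttle: nothing is suppressed.
print(executed_runs(10, 5, 60))    # [0, 10, 20, 30, 40, 50]

# 10s interval with a 60s throttle: at most one email per minute.
print(executed_runs(10, 60, 120))  # [0, 60]
```

In this model a throttle period shorter than the trigger interval never suppresses anything, which matches the "email every 10 seconds" behaviour described above.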

On top of that, there is acknowledgement-based throttling. An acknowledged watch action remains in the acked state until the watch’s condition evaluates to false. This is the intended behaviour.
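The acknowledgement behaviour can be sketched as a small state machine. The following Python is an illustrative model only, not Watcher code: the function `simulate` and its parameters are hypothetical, time-based throttling is ignored, and it assumes (as I understand the action ack states `awaits_successful_execution`, `ackable` and `acked`) that an acknowledgement only takes effect while the action is `ackable`, that is, after it has executed for a condition that is currently true.

```python
def simulate(condition_results, ack_at):
    """Return the check indices at which the email action would fire.

    condition_results: one bool per scheduled check (True = condition met).
    ack_at: set of check indices right after which an ack request is sent.
    Assumption: an ack is ignored unless the action is currently "ackable".
    """
    state = "awaits_successful_execution"
    fired = []
    for i, met in enumerate(condition_results):
        if not met:
            # A false condition resets the ack state.
            state = "awaits_successful_execution"
        elif state != "acked":
            fired.append(i)
            state = "ackable"
        if i in ack_at and state == "ackable":
            state = "acked"
    return fired


checks = [False, True, True, True, False, True, True]

# Ack sent before the condition was ever true: it has no effect.
print(simulate(checks, ack_at={0}))     # [1, 2, 3, 5, 6]

# Ack sent after the first email: silent until the condition resets.
print(simulate(checks, ack_at={1}))     # [1, 5, 6]

# Ack sent again after the condition turns true a second time.
print(simulate(checks, ack_at={1, 5}))  # [1, 5]
```

Under these assumptions, acknowledging right after creating the watch changes nothing, and a new acknowledgement is needed each time the condition goes from false back to true.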

I have the feeling that you do not need throttling but rather a more concise query to solve your problem, but maybe you can elaborate on the exact behaviour you need and we can work it out.

--Alex


(Aviral Srivastava) #3

Thanks for the response, Alex.

  1. I have already specified a timestamp range of the last 1 minute for checking HTTP errors.
  2. Throttling the watch may not meet our objective, because the length of time for which the condition remains true varies every time. So we are uncertain what value to specify for throttle_period.
    2a. If we acknowledge the watch instead of throttling, do we still get emails every 10s? That is what I am currently seeing.
  3. You said that on top there is acknowledgement-based throttling. Does that mean we need to set a throttle_period before acknowledging the watch?

(system) #4