Duplicates in Webhook

alerting

(Matthew) #1

Hey all,

I am configuring Watcher right now and am running into an issue. I have it all set up to forward messages to HipChat. It's working, but every 5 minutes it sends 3 duplicate messages from old events. What I'm looking for is to be notified whenever our apache error_logs are written to. From my understanding, a throttle period of 5m will trump the trigger schedule. I believe the time range filter isn't working right, because I'm being notified of events that happened hours ago. I would like to look back 5 minutes, every 5 minutes, and alert if any new events come in. Thanks!

{
  "trigger" : {
    "schedule" : { "interval" : "10s" }
  },
  "input" : {
    "search" : {
      "request" : {
        "indices" : [ "filebeat" ],
        "body" : {
          "query" : {
            "filtered" : {
              "query" : {
                "match" : { "source": "/path/to/my/apache/error_log" }
              },
              "filter" : {
                "range": {
                  "@timestamp" : {
                    "from" : "now-5m",
                    "to": "now"
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "actions" : {
    "notify-hipchat" : {
      "transform" : {
        "search" : {
          "request" : {
            "indices" : "filebeat",
            "body" : {
              "query" : { "match" : { "source": "/path/to/my/apache/error_log" } }
            }
          }
        }
      },
      "throttle_period" : "5m",
      "hipchat" : {
        "account" : "notify-monitoring",
        "message" : {
          "room" : [ "monitoring-room" ],
          "user" : "",
          "body": "New error seen in api apache error_log!\n\nHost:{{ctx.payload.hits.hits.0._source.beat.hostname}}\nMessage: {{ctx.payload.hits.hits.0._source.message}}",
          "format" : "text",
          "color" : "red",
          "notify" : true
        }
      }
    }
  }
}


(Alexander Reelsen) #2

Hey,

First, please use proper formatting when pasting watches; this makes them much more readable for everyone. See the code block documentation.

Second, there is one expected behaviour that you describe as an issue, and one mystery.

Let's start with the expected behaviour. You are querying correctly in the input for events in the most recent five minutes. However, before displaying that data, you are querying again in the transform part of the hipchat action, but this time without the date range filter. The transform's result replaces the payload, so the first hit used in the message comes from the unfiltered search and might not be in your time range. Why did you add this transform? It doesn't feel to me as if it's needed.
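To illustrate, here is a rough sketch of the same action with the transform dropped, so that the message templates read from the payload of the input search (which does carry the time range filter). All values are copied from the watch above; only the `transform` block is gone. Treat it as a fragment of the watch under that assumption, not a tested definition.

```json
{
  "actions" : {
    "notify-hipchat" : {
      "throttle_period" : "5m",
      "hipchat" : {
        "account" : "notify-monitoring",
        "message" : {
          "room" : [ "monitoring-room" ],
          "user" : "",
          "body" : "New error seen in api apache error_log!\n\nHost: {{ctx.payload.hits.hits.0._source.beat.hostname}}\nMessage: {{ctx.payload.hits.hits.0._source.message}}",
          "format" : "text",
          "color" : "red",
          "notify" : true
        }
      }
    }
  }
}
```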

Now on to the second issue, the multiple executions. Can you add the ID of the watch to the hipchat message body, like this:

"body" : "New error seen in apache log by watch {{ctx.watch_id}} ... <other stuff>"

This way you can see whether this watch is actually the one writing the data, or whether there is another watch.

--Alex


(Matthew) #4

Alexander,

Thank you for the tip on posting code and for the help with the query. I removed the transform; I had thought it was needed for an action to take place. Running the watch now, it picked up an error from within the past 5 minutes and sent the notification to HipChat. I went into Kibana to verify, and the error was indeed there. However, there were also two more errors within this time frame. I was wondering if something else needs to change so that all events matching my query are sent, or if this was just a fluke. I am also looking for recommendations on timings for triggering the watch and sending to our monitoring channel, i.e. check every 5 minutes and look back 5 minutes for matches, etc.

Thanks again!


(Alexander Reelsen) #5

Hey,

Well, that's part of your action definition. You are using {{ctx.payload.hits.hits.0._source.beat.hostname}}, which means you only access the very first entry of all your results. The 0 is an index into an array. If you want to loop through your entries, you can check out the Elasticsearch search template documentation for some more information about mustache. The general question is whether it might be enough to just show aggregates in your hipchat messages (and thus use aggregations), but that obviously depends on your use case.
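As a rough sketch of the looping idea, a mustache section over `ctx.payload.hits.hits` repeats its contents once per hit, and inside the section the field names resolve relative to the current hit. The message wording below is made up for illustration, and the fragment only shows the `body` field of the hipchat message, not a full watch.

```json
{
  "body" : "New errors seen in api apache error_log!\n{{#ctx.payload.hits.hits}}\nHost: {{_source.beat.hostname}}\nMessage: {{_source.message}}\n{{/ctx.payload.hits.hits}}"
}
```

Keep in mind that a search returns only 10 hits by default, so the loop lists at most that many entries unless the input request sets a larger `size`.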

--Alex
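
On the timing question raised above, one possible arrangement (a sketch based on the stated goal of checking every 5 minutes and looking back 5 minutes, not something confirmed in this thread) is to match the trigger interval to the look-back window and add a condition so that the action only runs when the search returned hits. It assumes the same index, query, and 2.x-era `filtered` syntax as the original watch; the `compare` condition on `ctx.payload.hits.total` is an addition that the original did not have.

```json
{
  "trigger" : {
    "schedule" : { "interval" : "5m" }
  },
  "input" : {
    "search" : {
      "request" : {
        "indices" : [ "filebeat" ],
        "body" : {
          "query" : {
            "filtered" : {
              "query" : {
                "match" : { "source" : "/path/to/my/apache/error_log" }
              },
              "filter" : {
                "range" : {
                  "@timestamp" : { "from" : "now-5m", "to" : "now" }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition" : {
    "compare" : { "ctx.payload.hits.total" : { "gt" : 0 } }
  }
}
```

With the interval and the window both at 5 minutes, each run covers roughly the period since the previous run. A throttle period equal to the interval might occasionally suppress a run, so that setting is worth testing rather than taking as given.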

