Hi,
I'm trying to set up a few watches that check for more than X Apache 5xx errors on my servers during the last Y period, based on Logstash inputs. I'm looking for the best strategy to run the check per host instead of across all hosts.
My goal is to avoid creating one watch per node, and to receive an email alert, split by host, with the error code and the problematic request.
What should I use to make this easier? Faceted search? Aggregations? How can I use that to write my condition? How can I do the split in the body of my email?
How are you handling this kind of problem?
Thanks in advance for your insight.
Maxime
Here is my current watch.
PUT _watcher/watch/apache_500_error
{
  "trigger": {
    "schedule": {
      "interval": "1m"
    }
  },
  "input": {
    "search": {
      "request": {
        "indices": [
          "logstash-apache-*"
        ],
        "body": {
          "query": {
            "filtered": {
              "query": {
                "match_all": {}
              },
              "filter": {
                "and": [
                  {
                    "numeric_range": {
                      "response": {
                        "gte": 500,
                        "lt": 600
                      }
                    }
                  },
                  {
                    "range": {
                      "@timestamp": {
                        "gte": "now-1h",
                        "lte": "now"
                      }
                    }
                  }
                ]
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.hits.total": {
        "gt": 20
      }
    }
  },
  "throttle_period": "1h",
  "actions": {
    "send_email": {
      "email": {
        "from": "from@mydomain.com",
        "to": [
          "to@mydomain.com"
        ],
        "subject": "{{ctx.payload.hits.total}} Errors on apache during the last hour",
        "body": {
          "html": "Error are on these urls : <br/> <table><tr><td>timestamp</td><td>error code</td><td>request</td></tr>{{#ctx.payload.hits.hits}}<tr><td>{{_source.timestamp}}</td><td>{{_source.response}}</td><td>{{_source.request}}</td></tr>{{/ctx.payload.hits.hits}}</table>"
        },
        "attach_data": true
      }
    }
  }
}
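For reference, here is the direction I am considering for the per-host split. It is an untested sketch, not a working watch. It adds a terms aggregation on the host, keeps only hosts with more than 20 errors via min_doc_count, and uses a top_hits sub-aggregation to pull a few sample requests per host. Everything new here is an assumption on my side: the watch id, the aggregation names by_host and sample_errors, the field name "host" (it may need to be a not_analyzed variant such as "host.raw"), the sample size of 5, and the script condition, which I believe requires inline scripting to be enabled.
PUT _watcher/watch/apache_500_error_by_host
{
  "metadata": {
    "note": "Sketch only: assumes a not_analyzed 'host' field and inline scripting enabled"
  },
  "trigger": {
    "schedule": {
      "interval": "1m"
    }
  },
  "input": {
    "search": {
      "request": {
        "indices": [
          "logstash-apache-*"
        ],
        "body": {
          "query": {
            "filtered": {
              "query": {
                "match_all": {}
              },
              "filter": {
                "and": [
                  { "numeric_range": { "response": { "gte": 500, "lt": 600 } } },
                  { "range": { "@timestamp": { "gte": "now-1h", "lte": "now" } } }
                ]
              }
            }
          },
          "aggs": {
            "by_host": {
              "terms": {
                "field": "host",
                "min_doc_count": 21
              },
              "aggs": {
                "sample_errors": {
                  "top_hits": {
                    "size": 5,
                    "sort": [
                      { "@timestamp": { "order": "desc" } }
                    ]
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "script": {
      "inline": "return ctx.payload.aggregations.by_host.buckets.size() > 0"
    }
  },
  "throttle_period": "1h",
  "actions": {
    "send_email": {
      "email": {
        "from": "from@mydomain.com",
        "to": [
          "to@mydomain.com"
        ],
        "subject": "Apache 5xx errors by host during the last hour",
        "body": {
          "html": "{{#ctx.payload.aggregations.by_host.buckets}}<h3>{{key}} : {{doc_count}} errors</h3><table><tr><td>timestamp</td><td>error code</td><td>request</td></tr>{{#sample_errors.hits.hits}}<tr><td>{{_source.timestamp}}</td><td>{{_source.response}}</td><td>{{_source.request}}</td></tr>{{/sample_errors.hits.hits}}</table>{{/ctx.payload.aggregations.by_host.buckets}}"
        }
      }
    }
  }
}
The idea is that a host only shows up as a bucket once it has at least 21 errors in the window, so the condition just checks whether any bucket exists, and the email template loops over the buckets to print one table per host. Note that throttle_period still applies to the whole watch, not to each host, so this would not give independent throttling per server.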