I have configured Watcher to trigger alerts in PagerDuty. If a service comes back online automatically, I would expect the PagerDuty alert to be resolved as well. Can someone help me work out how to achieve that?
Hello @SowmiyaRS ,
I didn't fully understand your issue. Assuming you want to monitor a service through Watcher and your indices have a field such as service.status with the value UP or DOWN, this can be done.
Below is the watch I use to monitor Tomcat for a DOWN status. My index has the field tomcat.server_status with the value UP or DOWN, and a new document arrives every 20 seconds.
I'm not sure whether this matches what you need. Check the watch and adapt the query and condition to your own case.
{
"trigger": {
"schedule": {
"interval": "10s"
}
},
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"mis-monitoring-webserver-*"
],
"rest_total_hits_as_int": true,
"body": {
"aggs": {
"host": {
"terms": {
"field": "host.name.keyword",
"order": {
"_key": "desc"
},
"size": 10000
},
"aggs": {
"port": {
"terms": {
"field": "tomcat.port",
"order": {
"_key": "desc"
},
"size": 10000
},
"aggs": {
"application_name": {
"terms": {
"field": "tomcat.application_name",
"order": {
"_key": "desc"
},
"size": 10000
},
"aggs": {
"status": {
"top_hits": {
"fields": [
{
"field": "tomcat.server_status"
}
],
"_source": false,
"size": 1,
"sort": [
{
"@timestamp": {
"order": "desc"
}
}
]
}
}
}
}
}
}
}
}
},
"size": 0,
"query": {
"bool": {
"filter": [
{
"script": {
"script": """if (doc['tomcat.server_status.keyword'].size() != 0 )
{
// The field is present: match only documents whose status is DOWN.
return doc['tomcat.server_status.keyword'].value == 'DOWN';
}
// The field is missing: do not match the document.
return false;
"""
}
},
{
"range": {
"@timestamp": {
"gte": "now-15m",
"lte": "now"
}
}
}
]
}
}
}
}
}
},
"condition": {
"compare": {
"ctx.payload.hits.total": {
"gt": 0
}
}
},
"actions": {
"email_1": {
"email": {
"profile": "standard",
"to": [
"prashant.mehta@xyz.com"
],
"subject": "TOMCAT DOWN STATUS",
"body": {
"html": """
{{#ctx.payload.aggregations.host.buckets}}
<p><b>Summary:</b> Tomcat is down on server {{key}}</p>
<p><b>Date and Time:</b> {{ctx.trigger.scheduled_time}}</p>
<p><b>Description:</b>Tomcat is down with the following details</p>
{{#port.buckets}}
- Port: {{key}}, Applications: {{#application_name.buckets}}{{key}}<br>
<p><b>Status:</b> DOWN</p>
{{/application_name.buckets}}{{/port.buckets}}
<p><b>Issued By:</b> CIS Monitoring System</p>
<hr />
{{/ctx.payload.aggregations.host.buckets}}
"""
}
}
}
}
}
My requirement is this: if a PagerDuty alert is triggered because the service crashed for a minute, and the service then recovers on its own in the next minute, Watcher should resolve the PagerDuty alert that it created.
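One possible way to get this behaviour is a single watch with two PagerDuty actions that share the same incident_key: one sends a trigger event when DOWN documents are found, and the other sends a resolve event when none are found. The following is only a minimal sketch, not a tested watch. It reuses the index and field names from the watch above, assumes a default PagerDuty account is already configured in Elasticsearch, and uses a made-up incident key (tomcat-down) and action names (pagerduty_trigger, pagerduty_resolve).

```json
{
  "metadata": {
    "note": "Sketch only. The incident_key value and the action names are hypothetical; the index and field names are taken from the watch earlier in this thread."
  },
  "trigger": {
    "schedule": {
      "interval": "1m"
    }
  },
  "input": {
    "search": {
      "request": {
        "indices": [
          "mis-monitoring-webserver-*"
        ],
        "rest_total_hits_as_int": true,
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "filter": [
                {
                  "term": {
                    "tomcat.server_status.keyword": "DOWN"
                  }
                },
                {
                  "range": {
                    "@timestamp": {
                      "gte": "now-1m",
                      "lte": "now"
                    }
                  }
                }
              ]
            }
          }
        }
      }
    }
  },
  "condition": {
    "always": {}
  },
  "actions": {
    "pagerduty_trigger": {
      "condition": {
        "compare": {
          "ctx.payload.hits.total": {
            "gt": 0
          }
        }
      },
      "pagerduty": {
        "event": {
          "event_type": "trigger",
          "incident_key": "tomcat-down",
          "description": "Tomcat is DOWN"
        }
      }
    },
    "pagerduty_resolve": {
      "condition": {
        "compare": {
          "ctx.payload.hits.total": {
            "eq": 0
          }
        }
      },
      "pagerduty": {
        "event": {
          "event_type": "resolve",
          "incident_key": "tomcat-down",
          "description": "Tomcat is back UP"
        }
      }
    }
  }
}
```

The watch-level condition is always true so that the per-action conditions decide which event is sent. Two caveats with this sketch: a hit count of 0 also occurs when no documents arrive at all, so a stopped data feed would look like a recovery, and a single incident_key covers every host. If you need one incident per host, the key would have to be built from the host name.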