We are using Watcher to monitor CPU utilization and send an alert email to the support team.
We found that the CPU usage field is stored as a decimal fraction rather than a percentage, so we tried to use a Painless script to convert it. However, the updated watch does not work as expected. The Elasticsearch log shows the following error:
[2016-12-09T08:34:08,077][ERROR][o.e.x.w.a.e.ExecutableEmailAction] [elk5-es-poc-node-3] failed to execute action [Linux_High_CPU_Alert/send_email]. failed to transform payload. ScriptException[runtime error]; nested: NullPointerException;
Can anyone help us find what is wrong in the Watcher code below? Many thanks for your help.
"trigger": {
"schedule": {
"interval": "1m"
}
},
"input": {
"search": {
"request": {
"indices": [ "<metricbeat-{now/d}>" ],
"body": {
"size": 0,
"query": {
"bool": {
"filter": [
{
"range": {
"@timestamp": {
"gt":"now-15m","lt":"now"
}
}
},
{
"range": {
"system.cpu.user.pct":{"gt":"0.2","lt":"2"}
}
}
]
}
},
"aggs": {
"group_by_hostname": {
"terms": {
"field": "beat.hostname.keyword",
"size": 5
},
"aggs": {
"get_latest": {
"terms": {
"field": "@timestamp",
"size": 1,
"order": {
"_term": "desc"
}
},
"aggs": {
"range": {
"date_range": {
"field": "@timestamp",
"format": "MM/dd/yyyy",
"ranges": [
{ "to": "now" },
{ "from": "now-10m" }
]
},
"aggs": {
"group_by_cpu_pct": {
"terms": {
"field": "system.cpu.user.pct"
}
}
}
}
}
}
}
}
}
}
}
}
},
"condition": {
"compare": {
"ctx.payload.hits.total": {
"gt": 0
}
}
},
"actions": {
"send_email": {
"transform": {
"script": {
"lang": "painless",
"inline": "ctx.payload.hits._source.system.cpu.user.pct = ctx.payload.hits._source.system.cpu.user.pct * params.percentage",
"params": {
"percentage": 10
}
}
},
"email": {
"to": [ "vinson@test.com" ],
"subject": "Watcher Notification - Found Server CPU High Utilization",
"body": {
"html": "<body><br><b>---------------------------High CPU Usage Alert---------------------------</b></br><br></br><br>The below Hosts have the high CPU usage. Please perform the OS checking accordingly!</br><br></br><br></br><table ><thead><tr><th>|</th><th>Hostname</th><th>|</th><th>Timestamp</th><th>|</th><th>CPU_Usage</th><th>|</th></tr></thead><tbody>{{#ctx.payload.aggregations.group_by_hostname.buckets}}<tr><td>|</td><td>{{key}}</td><td>| </td><td>{{#get_latest.buckets}}{{range.buckets.0.key}}</td><td>| </td><td>{{range.buckets.0.group_by_cpu_pct.buckets.0.key}}{{/get_latest.buckets}}</td><td>|</td></tr>{{/ctx.payload.aggregations.group_by_hostname.buckets}}</tbody></table><br><i>Report generated by ELK. This is system generated email, please do not reply.</i></br></body>"
}
}
}
}
}'
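
One possible reading of the NullPointerException, sketched in Python rather than Painless so it can be run outside Elasticsearch. The `payload` dictionary is a hypothetical, trimmed-down stand-in for the search response, not real output from this cluster, and `to_percent` is an illustrative helper name. The sketch assumes that, because the request sets "size": 0, `hits` holds only summary fields and an empty `hits` array, so the `ctx.payload.hits._source` path in the transform has nothing to resolve to and the CPU values live only in the aggregation buckets. It also assumes a factor of 100 for the fraction-to-percent conversion, whereas the watch passes 10 in `params.percentage`.

```python
# Hypothetical stand-in for the payload the watch receives (values are made up).
# With "size": 0 the hits array is empty; only the aggregations carry data.
payload = {
    "hits": {"total": 3, "max_score": 0.0, "hits": []},
    "aggregations": {
        "group_by_hostname": {
            "buckets": [
                {
                    "key": "host-a",
                    "get_latest": {
                        "buckets": [
                            {
                                "range": {
                                    "buckets": [
                                        {"group_by_cpu_pct": {"buckets": [{"key": 0.25}]}}
                                    ]
                                }
                            }
                        ]
                    },
                }
            ]
        }
    },
}

# The path used by the transform: "hits" is an object without a "_source" key,
# so the lookup yields None, the analogue of the null that Painless dereferences.
source = payload["hits"].get("_source")


def to_percent(payload, factor=100):
    """Scale every CPU bucket key from a fraction (0.25) to a percentage (25.0)."""
    for host in payload["aggregations"]["group_by_hostname"]["buckets"]:
        for latest in host["get_latest"]["buckets"]:
            for rng in latest["range"]["buckets"]:
                for cpu in rng["group_by_cpu_pct"]["buckets"]:
                    cpu["key"] = round(cpu["key"] * factor, 2)
    return payload


converted = to_percent(payload)
```

If this reading is right, a transform would need to walk the same bucket path that the email template reads ({{range.buckets.0.group_by_cpu_pct.buckets.0.key}}) and return the modified payload, instead of addressing `hits._source`.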