Hi everyone. I set out to create a really simple alert: when the CPU idle percentage drops below 20%, I want an alert. For simplicity I chose the logging action. Metricbeat is running on a few servers and posting data.
I found this really difficult! I'm a seasoned developer and I'm using the Elasticsearch reference pages, but it still took a looooong time to get this to begin to function. I'll paste the PUT request I finally came up with, but I have some questions:
- Are there plans to create a UI for this?
- The docs say to use ctx.payload.hits.hits.0.fields.theThingIWant, but that didn't work for me. Is the syntax I have, with blah.hits.0._source, a bad idea?
- This gets the most recent result and checks whether the idle percentage is at or below 0.2, but what if three servers were running hot? How would I change this to create alerts for all three? I can change the size property, and I guess a date range would be a good idea, but how do I deal with a collection of results in the action section?
- Is there a good tutorial for system vitals monitoring like this?
- Is this too many questions in one post?
Here's the PUT request:
PUT _xpack/watcher/watch/CPU_spike
{
  "trigger": {
    "schedule": {
      "interval": "10s"
    }
  },
  "input": {
    "search": {
      "request": {
        "indices": [
          "metricbeat-*"
        ],
        "body": {
          "size": 1,
          "sort": { "@timestamp": "desc" },
          "query": {
            "exists": {
              "field": "system.cpu.idle.pct"
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.hits.hits.0._source.system.cpu.idle.pct": {
        "lte": 0.2
      }
    }
  },
  "actions": {
    "log": {
      "logging": {
        "text": "{{ctx.payload.hits.hits.0._source.beat.name}} is only idle at {{ctx.payload.hits.hits.0._source.system.cpu.idle.pct}}"
      }
    }
  }
}
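For the multi-server question, here is an untested sketch of the direction I imagine this could take. It assumes the threshold can move into the query as a range filter, that a second range filter on @timestamp limits the search to recent documents, and that the logging text can loop over the hits with a Mustache section. The watch name CPU_spike_all, the 30s window, and the size of 10 are placeholders I made up, and it assumes ctx.payload.hits.total arrives as a plain number.

```
# Untested sketch: the watch name, "now-30s" window, and size 10 are placeholder assumptions.
PUT _xpack/watcher/watch/CPU_spike_all
{
  "trigger": {
    "schedule": {
      "interval": "10s"
    }
  },
  "input": {
    "search": {
      "request": {
        "indices": [
          "metricbeat-*"
        ],
        "body": {
          "size": 10,
          "sort": { "@timestamp": "desc" },
          "query": {
            "bool": {
              "filter": [
                { "range": { "system.cpu.idle.pct": { "lte": 0.2 } } },
                { "range": { "@timestamp": { "gte": "now-30s" } } }
              ]
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.hits.total": {
        "gt": 0
      }
    }
  },
  "actions": {
    "log": {
      "logging": {
        "text": "Hot servers: {{#ctx.payload.hits.hits}}{{_source.beat.name}} is only idle at {{_source.system.cpu.idle.pct}}; {{/ctx.payload.hits.hits}}"
      }
    }
  }
}
```

One caveat with this shape: a server that reports several times inside the window would show up more than once in the log line, so grouping by beat.name (for example with a terms aggregation) might be the cleaner route.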