Learning to create watches

Hi everyone. I set out to create a really simple alert: when the CPU idle percentage drops below 20%, I want to be notified. For simplicity I chose the logging action. Metricbeat is running on a few servers and posting data.

I found this really difficult! I'm a seasoned developer and I'm using the Elasticsearch reference pages, but it still took a looooong time to get this working at all. I'll paste the PUT request I finally came up with below, but first I have some questions:

  1. Are there plans to create a UI for this?
  2. The docs say to use ctx.payload.hits.hits.0.fields.theThingIWant, but that didn't work for me. Is the syntax I have, ctx.payload.hits.hits.0._source, a bad idea?
  3. This gets the most recent result and checks whether its idle percentage is at or below 0.2, but what if three servers were running hot? How would I change this to create alerts for all three? I can change the size property, and I guess a date range would be a good idea, but how do I deal with a collection of results in the actions section?
  4. Is there a good tutorial for system vitals monitoring like this?
  5. Is this too many questions in one post?

Here's the PUT request:

PUT _xpack/watcher/watch/CPU_spike
{
  "trigger": {
    "schedule": {
      "interval": "10s"
    }
  },
  "input": {
    "search": {
      "request": {
        "indices": [
          "metricbeat-*"
        ],
        "body": {
          "size": 1,
          "sort" : { "@timestamp" : "desc" },
          "query": {
            "exists": {
              "field": "system.cpu.idle.pct"
            }
          }
        }
      }
    }
  },
  "condition" : {
    "compare" : {
      "ctx.payload.hits.hits.0._source.system.cpu.idle.pct" : {
        "lte" : 0.2
      }
    }
  },
  "actions": {
    "log": {
      "logging": {
        "text": "{{ctx.payload.hits.hits.0._source.beat.name}} is only idle at {{ctx.payload.hits.hits.0._source.system.cpu.idle.pct}}"
      }
    }
  }
}

Hey there,

So, let's work through the questions, not necessarily in their original order :wink:

  • Question 1: yes
  • Question 5: maybe
  • Question 2: where in the docs is that stated? Usually you are good to go with the _source notation. We should think about updating the docs then.
  • Question 3: You may want to take a look at aggregations and how to create min/max/avg values on a per-server basis.
  • Question 4: The first thing to understand is that everything depends on your data. If you are able to write a query to extract the data you need, then it is also easy to write a watch for it. This also ties in a bit with question 3.
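
For question 3, a rough and untested sketch of the aggregation approach might look like the following. It groups the documents by beat.name, averages system.cpu.idle.pct per host, and keeps only the hosts whose average is at or below 0.2. The watch ID CPU_spike_per_host, the aggregation names (hosts, avg_idle, hot_only), the one-minute interval and time window, and the bucket size of 50 are all assumptions for illustration. It also assumes beat.name is mapped as a keyword field.

```
PUT _xpack/watcher/watch/CPU_spike_per_host
{
  "trigger": {
    "schedule": {
      "interval": "1m"
    }
  },
  "input": {
    "search": {
      "request": {
        "indices": [
          "metricbeat-*"
        ],
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "filter": [
                { "exists": { "field": "system.cpu.idle.pct" } },
                { "range": { "@timestamp": { "gte": "now-1m" } } }
              ]
            }
          },
          "aggs": {
            "hosts": {
              "terms": { "field": "beat.name", "size": 50 },
              "aggs": {
                "avg_idle": {
                  "avg": { "field": "system.cpu.idle.pct" }
                },
                "hot_only": {
                  "bucket_selector": {
                    "buckets_path": { "idle": "avg_idle" },
                    "script": "params.idle <= 0.2"
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "array_compare": {
      "ctx.payload.aggregations.hosts.buckets": {
        "path": "doc_count",
        "gt": { "value": 0, "quantifier": "some" }
      }
    }
  },
  "actions": {
    "log": {
      "logging": {
        "text": "Hosts low on idle CPU: {{#ctx.payload.aggregations.hosts.buckets}}{{key}} is only idle at {{avg_idle.value}}; {{/ctx.payload.aggregations.hosts.buckets}}"
      }
    }
  }
}
```

Because the bucket_selector drops every host above the threshold, the condition only has to check that at least one bucket is left. The mustache section in the logging text then loops over the remaining buckets, which is one way to handle a collection of results in an action.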

Hope this helps as a start.

--Alex

Hi Alex,

  1. \o/

  2. I was reading the information here for version 5.1. Glad to know that _source is a good option though.

To get a field value from a particular hit, use ctx.payload.hits.hits.<index>.fields.<fieldname>. For example, to get the message field from the first hit, use ctx.payload.hits.hits.0.fields.message.

  3. Thanks, I'll note that.

  4. I'm playing with Metricbeat stats. Assuming the correct options are configured in the .yml file, it'd be useful to have a guide that creates alerts from those metrics. I do have the sample dashboards for Metricbeat and Packetbeat and they're a great help. Sample alerts that use this same data would be really useful.

Thanks for your answer :slight_smile:

Gog

Hey,

A nice place to start might actually be the dashboards you created, because you can extract the query that was executed from each visualization and go from there.

--Alex
