Alert on SUM of field exceeding certain value?

Hi,

I'm new to X-Pack and have a question about watches.

I'm using ELK to monitor the status of a satellite-based internet connection we have at various locations. I would like to know if I can use a watch to sum up the total bytes transferred since the start of the month and send an email alert if the sum exceeds a value predefined in the watch.

I had a quick look at the user guide but it isn't immediately clear to me if this is possible.

So something like:

Schedule: Every 3 hours
Time range: 1st day of this month until now
where host:x.x.x.x (used to identify location by IP)
If SUM total.bytes > 10GB then send alert email

If possible I would like the automated email to include the host IP and SUM as well but that is not so important at this point.

Hey,

this sounds possible to me. The important part here is to write a proper search query first (independent of the watch). The search query needs to do the following:

  • have a range filter in your query that filters from the first of the month until now
  • have a term filter in your query (or a match query), that filters for the host you are interested in
  • have a sum aggregation in your request that adds up your bytes
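
The three points above could be combined into one request body along these lines. This is only a sketch: the field names `host`, `@timestamp` and `netflow.in_bytes` and the aggregation name `total_bytes` are assumptions, so swap in whatever your index actually uses. The date math `now/M` rounds down to the start of the current month.

```json
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "match": { "host": "1.1.1.1" } },
        { "range": { "@timestamp": { "gte": "now/M", "lte": "now" } } }
      ]
    }
  },
  "aggs": {
    "total_bytes": {
      "sum": { "field": "netflow.in_bytes" }
    }
  }
}
```

With `"size": 0` the response carries no hits, only the aggregation result, which is all the watch needs.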

Now you have the correct data (a single aggregation response returning a number), which you can use in the watcher condition to check if it exceeds a threshold.

If it does, send an email - where you can include this exact data.
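
As a rough sketch of that part of the watch, assuming the sum aggregation is named `total_bytes`, the threshold is 10 GiB (10737418240 bytes), and an email account is already configured for Watcher. The action name `email_admin` and the address are placeholders.

```json
{
  "condition": {
    "compare": {
      "ctx.payload.aggregations.total_bytes.value": { "gte": 10737418240 }
    }
  },
  "actions": {
    "email_admin": {
      "email": {
        "to": "admin@example.com",
        "subject": "Monthly data limit exceeded",
        "body": "Total bytes this month: {{ctx.payload.aggregations.total_bytes.value}}"
      }
    }
  }
}
```

The `{{...}}` placeholder in the body is filled in from the same payload the condition checks, so the email carries the actual sum.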

One last thing: If the IP is dynamic, you could just have an aggregation for the ip address, and then calculate the sum for each IP.
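
A sketch of that variant, with a terms aggregation per host and the sum nested inside it. The names `per_host` and `total_bytes` are made up for illustration, and the terms aggregation assumes `host` is a keyword-type field (it may need to be something like `host.keyword` depending on your mapping).

```json
{
  "size": 0,
  "query": {
    "range": { "@timestamp": { "gte": "now/M", "lte": "now" } }
  },
  "aggs": {
    "per_host": {
      "terms": { "field": "host", "size": 100 },
      "aggs": {
        "total_bytes": {
          "sum": { "field": "netflow.in_bytes" }
        }
      }
    }
  }
}
```

Each bucket in the response then holds one IP as its `key` and that IP's sum under `total_bytes.value`.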

Hope this helps.

--Alex

This is difficult. I know it's my own incompetence but I'm getting nowhere with this. It's all syntax errors, input type errors and whatnot.

Is there any detailed documentation or tutorial that explains how to build a watch from scratch on an idiot level? I'm sure the guide makes sense if you have experience dealing with this sort of thing, but for me one line of text explaining a function followed by a wall of code doesn't help me understand how to deal with this.

Hey @Sjaak01

Can you copy and paste what you have so far, and what errors you are getting so I can provide more direct help?

-Bohyun

Hi @bohyun

I got this. There are syntax errors and I'm sure the way I've written the query, some parts probably aren't even possible.

{
  "trigger": {
    "schedule": {
      "interval": "30m"
    }
  },
  "input": {
    "search": {
      "request": {
        "body": {
          "size": 0,
          "query": {
            "range" : {
              "date" : {
                "gte" : "now-%2FM",
                #Range needs to be this month, copied this from the Dashboard view as I couldn't find "this month" code on the documentation page
                "lt" : "now"
              }
            }
          },
          "query": {
            "match": {
              "host" : {
                "query": "1.1.1.1",
                #Eventually I want this to scan a range of IP's e.g. 1.x.x.x, 2.x.x.x and perform the query on each unique IP found
                #and check usage for each IP
                "type": "phrase"
              }
            }
          }
        },
        "indices": [ "netflow-*" ]
      }
    }
  },
  "aggs" : {
    "Total_bytes" : {
      "sum" : {
        "field" : "netflow.total_bytes"
      }
    }
  }
},
"condition": {
  "compare": {
    "netflow.total_bytes": {
      "gte": 10MB
      #Condition should also check whether action has already run in the past 24h, if so then don't send another email until atleast 24h has passed
      #to avoid getting spammed with emails.
    }
  }
},
"actions": {
  "my-logging-action": {
    "logging": {
      "text": "Monthly usage is {{netflow.total_bytes}} . Threshold is 10MB."
      #Will swap the logging action for an email action once the rest of the watch is working
    }
  }
}
}

Okay I think I'm getting there.

I have a search request that gives the desired value (not perfect but this will do for testing).

GET _search?
{
  "query": {
    "bool": {
      "must" : {
        "match": {
          "host": "1.1.1.1"
        }
      },
      "filter": {
        "range" : {
          "@timestamp": {
            "gte": "now-1M/M",
            "lte": "now"
          }
        }
      }
    }
  },
  "aggs" : {
    "total_usage" : { "sum" : { "field" : "netflow.in_bytes"} }
  },
  "size": 0
}

But if I put it into a watch and simulate it I get "Watcher: [parse_exception] could not parse watch execution request. unexpected token [VALUE_STRING]".

I can't figure out what is wrong because it appears similar to the examples.

{
  "trigger": {
    "schedule": {
      "interval": "30m"
    }
  },
 "input": {
    "search": {
      "request": {
        "indices": [ "netflow-*" ],
        "types": "netflow",
  "body": {
    "query": {
        "bool": {
            "must" : {
                "match": {
                    "host": "1.1.1.1"
                }
            },
            "filter": {
                "range" : {
                    "@timestamp": {
                      "gte": "now-1M/M",
                      "lte": "now"
                    }
                }
            }
        }
    },
       "aggs" : {
      "total_usage" : { "sum" : { "field" : "netflow.in_bytes"} }
   },
    "size": 0
},
  "condition": {
    "compare": {
      "ctx.payload.total_usage": {
        "gte": 10000000
      }
    }
  },
  "actions": {
    "my-logging-action": {
      "logging": {
        "text": "Limit exceeded."
      }
    }
  }
}
}
}
}

It's working.

I had to add some } to close the input before the condition. I also had to change the ctx.payload path to include the value; I didn't realize that ctx.payload.aggregations.total.value had to be written out completely.
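
For anyone wondering why the full path is needed: the search response that ends up in ctx.payload looks roughly like this, trimmed to the relevant part. The number is a made-up placeholder, not real output.

```json
{
  "aggregations": {
    "total": {
      "value": 1500000000.0
    }
  }
}
```

So the sum sits under `aggregations`, then the aggregation name (`total` here), then `value`, and the compare condition has to walk that whole path.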

Next up:

  • Sum the in and out fields, because I can't use my scripted total field.
  • Set up email and include the host & total value in that email.
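
One way the first item might be approached is with two sum aggregations and a script condition that adds them up. This is an untested sketch: the field `netflow.out_bytes` and the aggregation names `in_total` and `out_total` are assumptions, and the script key is `source` in recent versions while older 5.x releases used `inline`.

```json
{
  "input": {
    "search": {
      "request": {
        "indices": [ "netflow-*" ],
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "must": { "match": { "host": "1.1.1.1" } },
              "filter": {
                "range": { "@timestamp": { "gte": "now/M", "lte": "now" } }
              }
            }
          },
          "aggs": {
            "in_total": { "sum": { "field": "netflow.in_bytes" } },
            "out_total": { "sum": { "field": "netflow.out_bytes" } }
          }
        }
      }
    }
  },
  "condition": {
    "script": {
      "lang": "painless",
      "source": "return ctx.payload.aggregations.in_total.value + ctx.payload.aggregations.out_total.value >= 1073741824"
    }
  }
}
```

The compare condition only looks at a single value, which is why the addition is done in a script condition here.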

Question:
Is it possible to have the aggregation run for a range of hosts, compare the value for each host, send an email based on true/false for each host and apply throttling for hosts that did have a match in e.g. the past 24 hours but not for hosts that did not?
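
On the throttling part, for the single-host watch at least, an action can be given a `throttle_period` so it fires at most once in that window. A minimal sketch using the logging action from this thread (whether this can be made to work per host inside one watch is a separate question, since throttling applies to the action as a whole):

```json
{
  "actions": {
    "my-logging-action": {
      "throttle_period": "24h",
      "logging": {
        "text": "Limit exceeded."
      }
    }
  }
}
```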

My watch in case somebody reads this thread in the future:

{
  "trigger": {
    "schedule": {
      "interval": "30m"
    }
  },
  "input": {
    "search": {
      "request": {
        "indices": [ "netflow-*" ],
        "types": "netflow",
        "body": {
          "query": {
            "bool": {
              "must" : {
                "match": {
                  "host": "1.1.1.1"
                }
              },
              "filter": {
                "range" : {
                  "@timestamp": {
                    "gte": "now-1M/M",
                    "lte": "now"
                  }
                }
              }
            }
          },
          "aggs" : {
            "total" : { "sum" : { "field" : "netflow.in_bytes"} }
          },
          "size": 0
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.aggregations.total.value": {
        "gte": 1073741824
      }
    }
  },
  "actions": {
    "my-logging-action": {
      "logging": {
        "text": "Limit exceeded."
      }
    }
  }
}
