Ask a question about multiple hosts alarms

lightlevin · June 12, 2019, 12:38am

I need to monitor the memory usage rate of multiple hosts. The watch should fetch data every five minutes and, if the configured threshold is exceeded, trigger an alarm and send a separate warning message for each host, identified by host name. Ideally this would be a single, general-purpose alarm configuration. My current configuration:

{
  "trigger": {
    "schedule": {
      "interval": "5m"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": [
          "metricbeat-*"
        ],
        "rest_total_hits_as_int": true,
        "body": {
          "query": {
            "bool": {
              "filter": {
                "range": {
                  "@timestamp": {
                    "gte": "{{ctx.trigger.scheduled_time}}||-5m",
                    "lte": "{{ctx.trigger.scheduled_time}}",
                    "format": "strict_date_optional_time||epoch_millis"
                  }
                }
              }
            }
          },
          "aggs": {
            "host": {
              "terms": {
                "field": "host.name"
              },
              "aggs": {
                "metric": {
                  "avg": {
                    "field": "system.memory.used.pct"
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "script": {
      "source": "if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;",
      "lang": "painless",
      "params": {
        "threshold": 0.8
      }
    }
  },
  "actions": {
    "slack_1": {
      "transform": {
        "script": {
          "source": "def df = new DecimalFormat('##.##'); return ['memory_used': df.format(ctx.payload.aggregations.metricAgg.value * params.percent), 'hostname': ctx.payload.hits.hits.0._source.host.name]",
          "lang": "painless",
          "params": {
            "percent": 100
          }
        }
      },
      "slack": {
        "message": {
          "to": [
            "#elk"
          ],
          "text": "Host {{ctx.payload.hostname}} memory alarm, alarm value is {{ctx.payload.memory_used}}% ."
        }
      }
    }
  }
}

Output:

    "aggregations": {
      "host": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "doc_count": 146,
            "metric": {
              "value": 0.9885454545454546
            },
            "key": "es7_02"
          },
          {
            "doc_count": 139,
            "metric": {
              "value": 0.985
            },
            "key": "es7_01"
          }
        ]
      }
    }

I can get the host names of both hosts and the five-minute average of system.memory.used.pct, but a warning message is sent for only one host at a time. What do I need to do to read every key and value in the buckets so that each host is alerted, or is there a better method?

martinr_ubi · June 12, 2019, 2:46am

Do you want one watch that queries multiple hosts, finds all those above the threshold, and executes one action with one message mentioning all the hosts that are above the threshold?

Or do you want one watch that runs the same query but then executes the action per host found above the threshold, sending one message per host?

The first one is possible, of course; the second one is not supported.
You would either create one watch per host to get one message per host or, if that sounds insane in terms of the supporting automation and coordination required in your environment, ship the list of offending hosts (or a list of messages concerning them) to something that will do the demultiplexing for you. For example, a webhook into a simple program that sends an individual message per host the watch sent over, or a webhook into a simple program that takes a list of finished messages and sends each one individually.

In short: Watcher doesn’t have “execute action per item found in a data structure”.
It can only execute a predefined list of actions when the associated condition is true, which doesn’t cover dynamically sending as many messages as there are offending hosts in a multi-host alarm.
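
The webhook route described above could look roughly like the action below. This is only a sketch under stated assumptions: the host `alert-demux.example.internal`, the port `8080`, the path `/alerts`, and the action name `demux_webhook` are hypothetical placeholders for a small service you would write yourself. The body uses Watcher's `toJson` Mustache function to ship the `host` terms buckets from the watch in the question; the receiving program would then loop over the buckets, compare each `metric.value` with the threshold, and send one message per host.

```json
{
  "actions": {
    "demux_webhook": {
      "webhook": {
        "scheme": "http",
        "host": "alert-demux.example.internal",
        "port": 8080,
        "method": "post",
        "path": "/alerts",
        "headers": {
          "Content-Type": "application/json"
        },
        "body": "{{#toJson}}ctx.payload.aggregations.host.buckets{{/toJson}}"
      }
    }
  }
}
```

Sending the raw buckets keeps the watch simple and leaves the threshold filtering and message formatting to the receiver; the alternative is to filter in a transform first and post only the offending hosts.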

lightlevin · June 12, 2019, 4:07am

Thank you for your reply. I really wanted to implement the second one, but since it can't be done right now, how can I implement the first one?
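
For the first option (one message listing every host above the threshold), a minimal sketch follows. It assumes the trigger and input of the watch in the question stay unchanged, so the buckets arrive under `ctx.payload.aggregations.host.buckets`, each with a `metric.value`. Note that the condition and transform as posted read `ctx.payload.aggregations.metricAgg`, which the aggregation in that watch does not produce. The payload key `hosts` and its fields `name` and `memory_used` are names chosen here for illustration, not anything Watcher requires.

```json
{
  "condition": {
    "script": {
      "source": "for (def b : ctx.payload.aggregations.host.buckets) { if (b.metric.value != null && b.metric.value > params.threshold) { return true; } } return false;",
      "lang": "painless",
      "params": {
        "threshold": 0.8
      }
    }
  },
  "actions": {
    "slack_1": {
      "transform": {
        "script": {
          "source": "def df = new DecimalFormat('##.##'); def hosts = []; for (def b : ctx.payload.aggregations.host.buckets) { if (b.metric.value != null && b.metric.value > params.threshold) { hosts.add(['name': b.key, 'memory_used': df.format(b.metric.value * params.percent)]); } } return ['hosts': hosts];",
          "lang": "painless",
          "params": {
            "threshold": 0.8,
            "percent": 100
          }
        }
      },
      "slack": {
        "message": {
          "to": [
            "#elk"
          ],
          "text": "Memory alarm: {{#ctx.payload.hosts}}{{name}} at {{memory_used}}%; {{/ctx.payload.hosts}}"
        }
      }
    }
  }
}
```

The condition fires if any bucket exceeds the threshold, the transform keeps only the offending buckets, and the Mustache section in `text` repeats once per entry in `hosts`. The threshold is duplicated in both `params` blocks because each script only sees its own parameters. A `terms` aggregation returns 10 buckets by default, so with more hosts than that the `size` setting on the aggregation would need raising.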

system · July 10, 2019, 4:07am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
