Elastic Watcher Alert Failing (Memory & Cpu Usage)

pgcr · February 4, 2016, 3:30pm

I have been attempting to set up a watch using the template that can be found here: https://www.elastic.co/guide/en/watcher/current/watching-marvel-data.html

Current setup: 1 master node with 2 backups; 3 data nodes (3 shards each); 1 client node. Elasticsearch is at 2.1.1, fluentd 2.3.0, with the latest versions of the Watcher and Marvel plugins; the license is also installed (no account). This setup is running on CentOS on AWS.

The two alerts I am setting up are high CPU usage and high JVM memory usage. For testing purposes I have set the alerts to notify me if usage is above 3%, with an interval of 10s. Using plugin/head I can confirm that the watches are in fact running every 10s, but I normally get execution_not_needed or failed.

When checking the log under condition I see:
"condition": {
  "type": "script",
  "status": "failure",
  "reason": "GroovyScriptExecutionException[failed to run inline script [if (ctx.payload.aggregations.minutes.buckets.size() == 0) return false; def latest = ctx.payload.aggregations.minutes.buckets[-1]; def node = latest.nodes.buckets[0]; return node && node.memory && node.memory.value >= 3;] using lang [groovy]]; nested: NullPointerException[Cannot get property 'minutes' on null object]; "
},

I have
script.inline: on
script.indexed: on
On all data nodes and master node.

Any help & information is greatly appreciated.

spinscale · February 8, 2016, 8:22am

Hey,

can you use the execute Watch API and paste the output here?

Thanks!

--Alex
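
For readers following along, here is a minimal Python sketch of calling the Execute Watch API. The endpoint path matches the `POST _watcher/watch/<id>/_execute` call quoted later in this thread; the `http://localhost:9200` address and the helper names are assumptions for illustration only, not something the posters used.

```python
import json
import urllib.request


def build_execute_request(watch_id, base_url="http://localhost:9200"):
    """Build (but do not send) a POST to the Execute Watch endpoint.

    base_url is an assumed address; point it at your own client node.
    """
    url = "{}/_watcher/watch/{}/_execute".format(base_url.rstrip("/"), watch_id)
    # No request body is sent, so nothing in the watch definition is overridden.
    return urllib.request.Request(url, method="POST")


def execute_watch(watch_id, base_url="http://localhost:9200"):
    """Send the request and return the parsed watch record as a dict."""
    request = build_execute_request(watch_id, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `execute_watch("mem_watch")` against a reachable cluster should return a record shaped like the one pasted in the next post.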

pgcr · February 8, 2016, 3:01pm

Hey @spinscale,

After executing the watch, I receive the same error.

{
  "_id": "mem_watch_9-2016-02-08T14:58:55.046Z",
  "watch_record": {
    "watch_id": "mem_watch",
    "state": "executed",
    "trigger_event": {
      "type": "manual",
      "triggered_time": "2016-02-08T14:58:55.037Z",
      "manual": {
        "schedule": {
          "scheduled_time": "2016-02-08T14:58:55.045Z"
        }
      }
    },
    "input": {
      "search": {
        "request": {
          "search_type": "query_then_fetch",
          "indices": [
            ".marvel-*"
          ],
          "types": [],
          "body": {
            "size": 0,
            "query": {
              "filtered": {
                "filter": {
                  "range": {
                    "@timestamp": {
                      "gte": "now-2m",
                      "lte": "now"
                    }
                  }
                }
              }
            },
            "aggs": {
              "minutes": {
                "date_histogram": {
                  "field": "@timestamp",
                  "interval": "minute"
                },
                "aggs": {
                  "nodes": {
                    "terms": {
                      "field": "node.name.raw",
                      "size": 10,
                      "order": {
                        "memory": "desc"
                      }
                    },
                    "aggs": {
                      "memory": {
                        "avg": {
                          "field": "jvm.mem.heap_used_percent"
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    },
    "condition": {
      "script": "if (ctx.payload.aggregations.minutes.buckets.size() == 0) return false; def latest = ctx.payload.aggregations.minutes.buckets[-1]; def node = latest.nodes.buckets[0]; return node && node.memory && node.memory.value >= 3;"
    },
    "messages": [],
    "result": {
      "execution_time": "2016-02-08T14:58:55.046Z",
      "execution_duration": 97,
      "input": {
        "type": "simple",
        "status": "success",
        "payload": {
          "foo": "bar"
        }
      },
      "condition": {
        "type": "always",
        "status": "success",
        "met": true
      },
      "actions": [
        {
          "id": "send_email",
          "type": "email",
          "status": "failure",
          "transform": {
            "type": "script",
            "status": "failure",
            "reason": "GroovyScriptExecutionException[failed to run inline script [def latest = ctx.payload.aggregations.minutes.buckets[-1]; return latest.nodes.buckets.findAll { return it.memory && it.memory.value >= 3 };] using lang [groovy]]; nested: NullPointerException[Cannot get property 'minutes' on null object]; "
          },
          "reason": "Failed to transform payload"
        }
      ]
    }
  }
}

Regards,
Petro

spinscale · February 9, 2016, 10:34am

Hey,

I tested this locally. You don't have any Marvel data to check against (that's how I reproduced this error). The watch expects the aggregations data structure to be there, which only happens if data has been indexed.

Have you installed the marvel-agent and is it indexing into your local cluster?

--Alex
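
To make the failure mode concrete, here is a rough Python mirror of the Groovy condition from the watch, with an added guard for a payload that has no `aggregations` key. This is only an illustration of the logic, not Watcher code: the function name `memory_condition_met` and the `threshold` parameter are hypothetical, and in the real watch the equivalent guard would have to be written in the Groovy script itself.

```python
def memory_condition_met(payload, threshold=3):
    """Python mirror of the watch condition, with null guards added.

    Returns False instead of raising when the payload carries no
    aggregations, which is the situation behind the NullPointerException.
    """
    aggregations = payload.get("aggregations")
    if not aggregations:
        # The unguarded script fails here: "Cannot get property 'minutes' on null object".
        return False

    buckets = aggregations.get("minutes", {}).get("buckets", [])
    if not buckets:
        return False

    latest = buckets[-1]  # most recent minute, like buckets[-1] in the Groovy script
    node_buckets = latest.get("nodes", {}).get("buckets", [])
    if not node_buckets:
        return False

    memory = node_buckets[0].get("memory")
    return bool(memory) and memory.get("value") is not None and memory["value"] >= threshold
```

With a payload such as `{"foo": "bar"}` the function returns `False` rather than failing, while a payload whose latest bucket reports a heap value at or above the threshold returns `True`.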

pgcr · February 9, 2016, 8:10pm

Hey Alex,

Yep, I have marvel-agent running on the cluster (double-checked). Using _plugin/head/browser I can see the marvel files being generated...

The only thing I can think of is that I have not re-indexed the files manually.

spinscale · February 10, 2016, 10:32am

Hey,

Something is wrong with your watch: it does not execute a search query. Check the result section of your pasted response; it shows a simple input...

--Alex
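
The mismatch pointed out above can be checked mechanically. The sketch below compares the input and condition types in the watch definition with the types that actually ran, as reported under `result`. The helper name `find_overridden_stages` is hypothetical. One possible cause of such a mismatch, which the thread does not confirm, is an Execute Watch request body that sets `alternative_input` and `ignore_condition`; that would be consistent with the `simple` input and `always` condition seen in the pasted record.

```python
def find_overridden_stages(watch_record):
    """Report stages whose executed type differs from the defined type.

    Expects the dict found under "watch_record" in an Execute Watch
    response. Returns e.g. {"input": ("search", "simple")}.
    """
    result = watch_record.get("result", {})
    mismatches = {}
    for stage in ("input", "condition"):
        # In the definition, the type is the single key, e.g. {"search": {...}}.
        defined = next(iter(watch_record.get(stage, {})), None)
        # In the result, the type is reported explicitly.
        executed = result.get(stage, {}).get("type")
        if defined is not None and executed is not None and defined != executed:
            mismatches[stage] = (defined, executed)
    return mismatches
```

Applied to the record pasted earlier in the thread, this would flag both stages: the defined `search` input ran as `simple`, and the defined `script` condition ran as `always`.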

iqbal_nazir · June 9, 2016, 9:56am

Hi Alex,
I am also having a similar issue with Watcher. I don't receive any email for CPU and memory usage. I know my email configuration in elasticsearch.yml is correct because I receive email for another watch. I have followed https://www.elastic.co/guide/en/watcher/current/watching-marvel-data.html#watching-cpu-usage and set the CPU usage threshold to 5% just to check whether I receive any email. After reading this post I ran POST _watcher/watch/cpu_usage/_execute, which shows me output like this...
{
  "_id": "cpu_usage_168-2016-06-09T09:44:12.366Z",
  "watch_record": {
    "watch_id": "cpu_usage",
    "state": "execution_not_needed",
    "trigger_event": {
      "type": "manual",
      "triggered_time": "2016-06-09T09:44:12.366Z",
      "manual": {
        "schedule": {
          "scheduled_time": "2016-06-09T09:44:12.366Z"
...
....
...
I have checked in Marvel that my node is consuming more than 10% CPU all the time. Still, I don't receive any email. Do you have any solution for me? (I'm a beginner with Elasticsearch, so a detailed answer would be really appreciated.)
Thanks in advance.
--Iqbal

spinscale · June 9, 2016, 10:31am

hey,

Please open a new thread and include the output of calling the Execute Watch API.

--Alex
