Watcher alerts email issue

Hi,

I created a Watcher alert for my ML jobs that should fire when the anomaly score goes above 70, but I am not getting the email notification. The watch result shows a socket timeout exception.

How can I resolve this issue?

elasticsearch.yml Watcher settings:
xpack.notification.email.account:
  exchange_account:
    profile: outlook
    email_defaults:
      from: mejari.vishnu.vardhan@domain.com
    smtp:
      auth: true
      starttls.enable: true
      host: smtp.office365.com
      port: 587
      user: mejari.vishnu.vardhan@domain.com

Error in the watch execution:

{
"watch_id": "22bb41d4-09e5-4c86-81e6-f2f7aef2b402",
"node": "Ewo-SXYbROyijWhkmirJPw",
"state": "executed",
"status": {
"state": {
"active": true,
"timestamp": "2019-12-16T12:41:18.996Z"
},
"last_checked": "2019-12-16T12:42:43.418Z",
"last_met_condition": "2019-12-16T12:42:43.418Z",
"actions": {
"email_1": {
"ack": {
"timestamp": "2019-12-16T12:41:18.996Z",
"state": "awaits_successful_execution"
},
"last_execution": {
"timestamp": "2019-12-16T12:42:43.418Z",
"successful": false,
"reason": ""
}
}
},
"execution_state": "executed",
"version": -1
},
"trigger_event": {
"type": "schedule",
"triggered_time": "2019-12-16T12:42:43.417Z",
"schedule": {
"scheduled_time": "2019-12-16T12:42:43.051Z"
}
},
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
".ml-anomalies-*"
],
"rest_total_hits_as_int": true,
"body": {
"size": 0,
"query": {
"bool": {
"filter": {
"range": {
"timestamp": {
"gte": "{{ctx.trigger.scheduled_time}}||-100d",
"lte": "{{ctx.trigger.scheduled_time}}",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
},
"aggs": {
"metricAgg": {
"max": {
"field": "anomaly_score"
}
}
}
}
}
}
},
"condition": {
"script": {
"source": "if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;",
"lang": "painless",
"params": {
"threshold": 69
}
}
},
"metadata": {
"name": "Alert_watcher_demo",
"watcherui": {
"trigger_interval_unit": "m",
"agg_type": "max",
"time_field": "timestamp",
"trigger_interval_size": 1,
"term_size": 5,
"time_window_unit": "d",
"threshold_comparator": ">",
"term_field": null,
"index": [
".ml-anomalies-*"
],
"time_window_size": 100,
"threshold": 69,
"agg_field": "anomaly_score"
},
"xpack": {
"type": "threshold"
}
},
"result": {
"execution_time": "2019-12-16T12:42:43.418Z",
"execution_duration": 120227,
"input": {
"type": "search",
"status": "success",
"payload": {
"_shards": {
"total": 1,
"failed": 0,
"successful": 1,
"skipped": 0
},
"hits": {
"hits": [],
"total": 2231,
"max_score": null
},
"took": 21,
"timed_out": false,
"aggregations": {
"metricAgg": {
"value": 93.91975
}
}
},
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
".ml-anomalies-*"
],
"rest_total_hits_as_int": true,
"body": {
"size": 0,
"query": {
"bool": {
"filter": {
"range": {
"timestamp": {
"gte": "2019-12-16T12:42:43.051Z||-100d",
"lte": "2019-12-16T12:42:43.051Z",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
},
"aggs": {
"metricAgg": {
"max": {
"field": "anomaly_score"
}
}
}
}
}
}
},
"condition": {
"type": "script",
"status": "success",
"met": true
},
"transform": {
"type": "script",
"status": "success",
"payload": {
"result": 93.91975
}
},
"actions": [
{
"id": "email_1",
"type": "email",
"status": "failure",
"error": {
"root_cause": [
{
"type": "messaging_exception",
"reason": "failed to send email with subject [Watch [Alert_watcher_demo] has exceeded the threshold] via account [exchange_account]"
}
],
"type": "messaging_exception",
"reason": "failed to send email with subject [Watch [Alert_watcher_demo] has exceeded the threshold] via account [exchange_account]",
"caused_by": {
"type": "mail_connect_exception",
"reason": "Couldn't connect to host, port: smtp.office365.com, 587; timeout 120000",
"caused_by": {
"type": "socket_timeout_exception",
"reason": "connect timed out"
}
}
}
}
]
},
"messages": []
}

Please take the time to properly format your messages. This forum supports markdown and thus code snippets, which make configuration snippets or JSON much easier to read. Thanks!

Couldn't connect to host, port: smtp.office365.com, 587; timeout 120000

This means that the node which executed the watch could not connect to the Office 365 SMTP server. If you are using Elasticsearch 6 or above, that node is a data node. Can you make sure that every data node can connect to the mail server? That might mean asking your network administrator whether a firewall is blocking the connection.
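The line quoted above sits at the bottom of the nested `caused_by` chain in the action's `error` object. As a rough sketch of how to pull that chain out of a watch result, the snippet below walks the `error` dictionary copied from the output in the question. The helper name `error_chain` is made up for illustration and is not part of any Elasticsearch client.

```python
def error_chain(error):
    """Return (type, reason) pairs from the outermost error to the innermost cause."""
    chain = []
    while isinstance(error, dict):
        chain.append((error.get("type"), error.get("reason")))
        error = error.get("caused_by")  # None once the innermost cause is reached
    return chain


# The "error" object of the failed email_1 action, copied from the watch result.
action_error = {
    "type": "messaging_exception",
    "reason": "failed to send email with subject [Watch [Alert_watcher_demo] "
              "has exceeded the threshold] via account [exchange_account]",
    "caused_by": {
        "type": "mail_connect_exception",
        "reason": "Couldn't connect to host, port: smtp.office365.com, 587; timeout 120000",
        "caused_by": {
            "type": "socket_timeout_exception",
            "reason": "connect timed out",
        },
    },
}

for error_type, reason in error_chain(action_error):
    print(f"{error_type}: {reason}")
```

The last entry in the chain is usually the most useful one to look at first. Here it is the plain socket timeout, which points at the network rather than at the watch itself.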

Hi,
There are 5 nodes in my cluster (2 master, 2 data, 1 ML). Should I configure email on all of the nodes?
How can I check whether there is a firewall issue?

The easiest approach is to have the same configuration on all nodes.

Regarding the firewall issue, please see my comment above about talking to your network administrator. You can help the administrator by trying to connect to the SMTP server with telnet and seeing whether that works.
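If telnet is not installed on a node, a few lines of Python can do roughly the same check. This is only a sketch of a plain TCP connection test. It does not speak SMTP or STARTTLS, and `can_connect` is a hypothetical helper name, not something shipped with Elasticsearch.

```python
import socket


def can_connect(host, port, timeout=10.0):
    """Try to open a plain TCP connection, roughly what `telnet host port` does.

    Returns (True, "") on success, or (False, description) on failure.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:  # covers timeouts, refused connections and DNS failures
        return False, f"{type(exc).__name__}: {exc}"
```

Running `can_connect("smtp.office365.com", 587)` on each data node should show which nodes can reach the mail server. A timeout on a node would be consistent with a firewall dropping the traffic, which is something to pass on to the network administrator.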
