Hi!
Thank you for replying.
I have a feeling that this is related to the email configuration, because it started right after I enabled it. But let's break it down:
The Kibana UI looks like this: [screenshot]
I have grepped through the logs on all three of my ES nodes.
They contain multiple events like this every day:
[2018-01-25T11:38:03,082][WARN ][o.e.x.w.e.ExecutionService] [analyzer03] failed to execute watch [roClcHctTIWj8HFwteQHQw_elasticsearch_cluster_status]
[2018-01-25T11:40:01,045][WARN ][o.e.x.w.e.ExecutionService] [analyzer03] failed to execute watch [roClcHctTIWj8HFwteQHQw_elasticsearch_cluster_status]
[2018-01-25T11:50:59,713][WARN ][o.e.x.w.e.ExecutionService] [analyzer03] failed to execute watch [roClcHctTIWj8HFwteQHQw_elasticsearch_cluster_status]
[2018-01-25T12:59:59,605][WARN ][o.e.x.w.e.ExecutionService] [analyzer03] failed to execute watch [roClcHctTIWj8HFwteQHQw_elasticsearch_cluster_status]
[2018-01-25T13:24:58,094][WARN ][o.e.x.w.e.ExecutionService] [analyzer03] failed to execute watch [roClcHctTIWj8HFwteQHQw_kibana_version_mismatch]
[2018-01-25T15:57:58,422][WARN ][o.e.x.w.e.ExecutionService] [analyzer03] failed to execute watch [roClcHctTIWj8HFwteQHQw_kibana_version_mismatch]
[2018-01-25T17:07:58,112][WARN ][o.e.x.w.e.ExecutionService] [analyzer03] failed to execute watch [roClcHctTIWj8HFwteQHQw_kibana_version_mismatch]
The version mismatch is okay; the versions differ by only 0.0.1 in either direction, so it shouldn't really affect this.
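To see which watches fail most often on which node, the WARN lines above could be tallied with a small script. This is only a sketch: the regular expression is written against the log format shown in this post, and the function name `tally_failed_watches` is made up for illustration.

```python
import re
from collections import Counter

# Matches Watcher WARN lines in the format shown above, e.g.
# [2018-01-25T11:38:03,082][WARN ][o.e.x.w.e.ExecutionService] [analyzer03] failed to execute watch [<watch id>]
FAILED_WATCH = re.compile(
    r"\[(?P<ts>[^\]]+)\]\[WARN\s*\]\[[^\]]+\]\s+\[(?P<node>[^\]]+)\]\s+"
    r"failed to execute watch \[(?P<watch>[^\]]+)\]"
)


def tally_failed_watches(lines):
    """Count 'failed to execute watch' events per (node, watch id) pair."""
    counts = Counter()
    for line in lines:
        match = FAILED_WATCH.search(line)
        if match:
            counts[(match.group("node"), match.group("watch"))] += 1
    return counts


# Usage sketch (the path is a placeholder for an uncompressed log file):
# with open("/var/log/elasticsearch/analyzer-prod.log") as handle:
#     for (node, watch), n in tally_failed_watches(handle).most_common():
#         print(node, watch, n)
```

Lines that do not match the pattern, such as the ERROR lines from ExecutableEmailAction, are simply skipped.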
I think this is related:
/var/log/elasticsearch/analyzer-prod-2018-01-25.log.gz:[2018-01-25T13:09:18,286][ERROR][o.e.x.w.a.e.ExecutableEmailAction] [analyzer01] failed to execute action [roClcHctTIWj8HFwteQHQw_logstash_version_mismatch/send_email_to_admin]
/var/log/elasticsearch/analyzer-prod-2018-01-25.log.gz:javax.mail.MessagingException: failed to send email with subject [[RESOLVED] X-Pack Monitoring: Logstash Version Mismatch (roClcHctTIWj8HFwteQHQw)] via account [exchange_account]
It doesn't say why it fails.
If I query the watcher history for that same day, I find this:
GET .watcher-history-7-2018.01.25/_search
{
  "query": { "match": { "watch_id": "roClcHctTIWj8HFwteQHQw_elasticsearch_cluster_status" } }
}
"actions": {
  "send_email_to_admin": {
    "ack": {
      "timestamp": "2017-12-19T08:10:25.512Z",
      "state": "awaits_successful_execution"
    },
    "last_execution": {
      "timestamp": "2018-01-21T00:01:08.762Z",
      "successful": false,
      "reason": ""
    }
The reason is empty.
I browsed through the watch history and searched for "error", "fail", "action", and "email", but I couldn't find any error messages.
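Instead of searching the history by keyword, the search body could be narrowed to executions where an action failed. The sketch below only builds that request body. It assumes the history documents carry the fields `result.actions.status` (with the value `failure`) and `result.execution_time`; those field names are assumptions that should be checked against the actual mapping of the `.watcher-history-*` index. The helper name `failed_action_query` is hypothetical.

```python
import json


def failed_action_query(watch_id, size=10):
    """Build a search body for watch executions with a failed action.

    Field names under "result" are assumptions; verify them against the
    index mapping before relying on the results.
    """
    return {
        "size": size,
        "sort": [{"result.execution_time": {"order": "desc"}}],
        "_source": ["watch_id", "result.execution_time", "result.actions"],
        "query": {
            "bool": {
                "filter": [
                    {"term": {"watch_id": watch_id}},
                    {"term": {"result.actions.status": "failure"}},
                ]
            }
        },
    }


# Print the body so it can be pasted after
# "GET .watcher-history-7-2018.01.25/_search" in the Kibana console.
print(json.dumps(
    failed_action_query("roClcHctTIWj8HFwteQHQw_elasticsearch_cluster_status"),
    indent=2,
))
```

If the query returns hits, the `result.actions` entries are the place to look for a more detailed failure reason than the empty `reason` shown above.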
I have configured email settings like this in elasticsearch.yml (on all nodes):
notification:
  email:
    html:
      sanitization:
        enabled: false
    account:
      exchange_account:
        profile: outlook
        email_defaults:
          from: analyzer@company.tld
        smtp:
          starttls.enable: true
          host: outlook.company.tld
          port: 25
If I try to connect to the mail server with cURL, it works from all the nodes:
[root@analyzer03 ~]# curl --ssl-reqd -v smtp://outlook.company.tld
* About to connect() to outlook.company.tld port 25 (#0)
*   Trying XXX.XXX.XXX.XXX...
* Connected to outlook.company.tld (XXX.XXX.XXX.XXX) port 25 (#0)
< 220 exch2k16.intra.company.tld Microsoft ESMTP MAIL Service ready at Wed, 7 Feb 2018 12:27:16 +0200
> EHLO analyzer03
< 250-exch2k16.intra.company.tld Hello [XXX.XXX.XXX.XXX]
< 250-SIZE 36700160
< 250-PIPELINING
< 250-DSN
< 250-ENHANCEDSTATUSCODES
< 250-STARTTLS
< 250-8BITMIME
< 250-BINARYMIME
< 250 CHUNKING
> STARTTLS
< 220 2.0.0 SMTP server ready
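The cURL check above only gets as far as the STARTTLS response; it does not show whether the server accepts a message from the configured sender. A fuller test, outside of Watcher, could be sketched with Python's standard `smtplib`. The helper names and the recipient address in the usage comment are placeholders, and no authentication is attempted, matching the account settings shown earlier.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender, recipient, subject="Watcher SMTP test"):
    """Create a minimal plain-text message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content("Test message sent outside of Watcher.")
    return msg


def send_test_mail(host, port, sender, recipient, timeout=10, debug=False):
    """Try a full send over STARTTLS.

    Returns None on success, or a string describing the failure.
    """
    msg = build_test_message(sender, recipient)
    try:
        with smtplib.SMTP(host, port, timeout=timeout) as smtp:
            if debug:
                smtp.set_debuglevel(1)  # print the SMTP dialogue to stderr
            smtp.ehlo()
            smtp.starttls()
            smtp.ehlo()  # re-identify over the encrypted channel
            smtp.send_message(msg)
    except (smtplib.SMTPException, OSError) as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


# Usage sketch (the recipient is a placeholder):
# print(send_test_mail("outlook.company.tld", 25,
#                      "analyzer@company.tld", "admin@company.tld",
#                      debug=True))
```

If the server rejects the sender, the recipient, or the TLS handshake, the returned string should contain the SMTP reply or exception that the Watcher log does not show.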
Thanks!