Watcher with Slack using a proxy

alerting

(Akhilesh Anb) #1

Hi,

We have set up a watch for cluster health monitoring. Our nodes are behind a proxy, so the Slack action fails with this error:

"sent_messages": [
{
"status": "failure",
"reason": "UnknownHostException[hooks.slack.com]",

I then added these lines to elasticsearch.yml:

watcher.http.proxy:
  host: ip-address
  port: 80

After adding the proxy settings, I see this error:

ElasticsearchTimeoutException[failed to execute http request. timeout expired]; nested: SocketTimeoutException[Read timed out]; slack

Please help me with this.


(Akhilesh Anb) #2

I have added these two lines as well:

watcher.http.default_connection_timeout: 5s
watcher.http.default_read_timeout: 20s

I am still facing the same problem.


(Alexander Reelsen) #3

So, this looks as if you cannot connect to hooks.slack.com. Can you provide more information, such as which Elasticsearch version you are using?

Can you also provide the full watch and the full output of the watch history or the execute watch API, for better debugging?

Also, please take the time to format the output properly, as it will be pretty long, I suppose.

Thanks a lot!


(Akhilesh Anb) #4

Elasticsearch version: 2.3.5

This is the watch I'm using:

{
  "trigger" : {
    "schedule" : { "interval" : "2m" }
  },
  "input" : {
    "http" : {
      "request" : {
        "host" : "3.3.47.215",
        "port" : 9200,
        "path" : "/_cluster/health",
        "auth" : {
          "basic" : {
            "username" : "akhil",
            "password" : "akhil@123"
          }
        }
      }
    }
  },
  "condition" : {
    "compare" : {
      "ctx.payload.status" : { "eq" : "red" }
    }
  },
  "actions" : {
    "notify-slack" : {
      "slack" : {
        "message" : {
          "to" : [ "#opt-es", "@akhilesh_appana" ],
          "text" : "cluster_health alert: Someone needs to look at the SPOT-Stage cluster. It appears to be in a RED state. (facepalm)"
        }
      }
    }
  }
}


(Akhilesh Anb) #5

This is the output before adding the proxy settings to elasticsearch.yml:

"input": {
            "http": {
              "request": {
                "scheme": "http",
                "host": "3.3.47.215",
                "port": 9200,
                "method": "get",
                "path": "/_cluster/health",
                "params": {},
                "headers": {},
                "auth": {
                  "basic": {
                    "username": "akhil",
                    "password": "akhil@123"
                  }
                }
              }
            }
          },
          "condition": {
            "compare": {
              "ctx.payload.status": {
                "eq": "red"
              }
            }
          },
          "messages": [],
          "result": {
            "execution_time": "2017-04-12T04:27:26.904Z",
            "execution_duration": 31,
            "input": {
              "type": "http",
              "status": "success",
              "payload": {
                "cluster_name": "spot-qa",
                "status": "red",
                "timed_out": false,
                "number_of_nodes": 2,
                "number_of_data_nodes": 2,
                "active_primary_shards": 86,
                "active_shards": 153,
                "relocating_shards": 0,
                "initializing_shards": 4,
                "unassigned_shards": 15,
                "delayed_unassigned_shards": 0,
                "number_of_pending_tasks": 0,
                "number_of_in_flight_fetch": 0,
                "task_max_waiting_in_queue_millis": 0,
                "active_shards_percent_as_number": 88.95348837209302
              },
              "http": {
                "request": {
                  "host": "3.3.87.248",
                  "port": 9200,
                  "scheme": "http",
                  "method": "get",
                  "path": "/_cluster/health",
                  "auth": {
                    "username": "spotadmin",
                    "password": "sp0t@dm1n"
                  }
                },
                "status_code": 200
              }
            },
            "condition": {
              "type": "compare",
              "status": "success",
              "met": true,
              "compare": {
                "resolved_values": {
                  "ctx.payload.status": "red"
                }
              }
            },
            "actions": [
              {
                "id": "notify-slack",
                "type": "slack",
                "status": "failure",
                "slack": {
                  "account": "monitoring",
                  "sent_messages": [
                    {
                      "status": "failure",
                      "reason": "UnknownHostException[hooks.slack.com]",
                      "to": "#opt-es",
                      "message": {
                        "from": "Watcher",
                        "text": "cluster_health alert: Someone needs to look at the SPOT-Stage cluster. It appears to be in a RED state. (facepalm)"
                      }
                    },
                    {
                      "status": "failure",
                      "reason": "UnknownHostException[hooks.slack.com]",
                      "to": "@akhilesh_appana",
                      "message": {
                        "from": "Watcher",
                        "text": "cluster_health alert: Someone needs to look at the SPOT-Stage cluster. It appears to be in a RED state. (facepalm)"
                      }
                    }
                  ]
                }
              }
            ]
          }

(Akhilesh Anb) #6

This is the output after adding the proxy and timeout settings to elasticsearch.yml:

> {
>   "took": 6,
>   "timed_out": false,
>   "_shards": {
>     "total": 1,
>     "successful": 1,
>     "failed": 0
>   },
>   "hits": {
>     "total": 492,
>     "max_score": 1,
>     "hits": [
>       {
>         "_index": ".watch_history-2017.04.16",
>         "_type": "watch_record",
>         "_id": "cluster_health_watch_1129-2017-04-16T00:00:43.711Z",
>         "_score": 1,
>         "_source": {
>           "watch_id": "cluster_health_watch",
>           "state": "failed",
>           "trigger_event": {
>             "type": "schedule",
>             "triggered_time": "2017-04-16T00:00:43.711Z",
>             "schedule": {
>               "scheduled_time": "2017-04-16T00:00:43.388Z"
>             }
>           },
>           "input": {
>             "http": {
>               "request": {
>                 "scheme": "http",
>                 "host": "3.3.47.215",
>                 "port": 9200,
>                 "method": "get",
>                 "path": "/_cluster/health",
>                 "params": {},
>                 "headers": {},
>                 "auth": {
>                   "basic": {
>                     "username": "akhil",
>                     "password": "akhil@123"
>                   }
>                 }
>               }
>             }
>           },
>           "condition": {
>             "compare": {
>               "ctx.payload.status": {
>                 "eq": "red"
>               }
>             }
>           },
>           "messages": [
>             "failed to execute watch input"
>           ],
>           "result": {
>             "execution_time": "2017-04-16T00:00:43.711Z",
>             "execution_duration": 20024,
>             "input": {
>               "type": "http",
>               "status": "failure",
>               "reason": "ElasticsearchTimeoutException[failed to execute http request. timeout expired]; nested: SocketTimeoutException[Read timed out]; ",
>               "http": {
>                 "request": {
>                   "host": "3.3.47.215",
>                   "port": 9200,
>                   "scheme": "http",
>                   "method": "get",
>                   "path": "/_cluster/health",
>                   "auth": {
>                     "username": "akhil",
>                     "password": "akhil@123"
>                   }
>                 }
>               }
>             },
>             "actions": []
>           }
>         }
>       },
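
The two history records in this thread fail at different stages. In the first, the input succeeds and the `notify-slack` action fails. In the one above, the `input` itself reports `"status": "failure"` and `actions` is empty, so the Slack action was never reached. That may mean the request to the cluster is now being routed through the proxy, but this is an inference from the record and is not confirmed in the thread. Below is a minimal Python sketch for spotting the failing stage in a `watch_record` `_source`. The function name `first_failed_stage` is hypothetical, and it reads only fields that appear in the records posted here.

```python
def first_failed_stage(record):
    """Return (stage, reason) for the first failed stage of a watch record.

    `record` is the `_source` of a watch_record document. Returns None when
    no stage reports a failure.
    """
    result = record.get("result", {})

    # Stages execute in order: input, condition, then each action.
    for stage in ("input", "condition"):
        info = result.get(stage, {})
        if info.get("status") == "failure":
            return stage, info.get("reason")

    for action in result.get("actions", []):
        if action.get("status") == "failure":
            # For Slack actions the reason sits on each sent message.
            messages = action.get("slack", {}).get("sent_messages", [])
            reasons = [m.get("reason") for m in messages
                       if m.get("status") == "failure"]
            reason = reasons[0] if reasons else None
            return "action:" + action.get("id", "unknown"), reason

    return None
```

Applied to the record above, this sketch reports the `input` stage. Applied to the earlier record, it reports `action:notify-slack` with the `UnknownHostException` reason.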

(Akhilesh Anb) #7

When I try to call the execute watch API, I get this error:

> {
>   "error": {
>     "root_cause": [
>       {
>         "type": "parse_exception",
>         "reason": "could not parse watch execution request. unexpected object field [trigger]"
>       }
>     ],
>     "type": "parse_exception",
>     "reason": "could not parse watch execution request. unexpected object field [trigger]"
>   },
>   "status": 400
> }
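
The `unexpected object field [trigger]` message suggests that the watch definition itself was sent as the request body, although the request that was sent is not shown in the thread. One alternative is to execute the already stored watch by its ID (`cluster_health_watch` in the history record above) and send only execution options in the body. The Python sketch below just builds the method, path, and body for such a call. The helper name `build_execute_request` is hypothetical, and the endpoint shape and the `ignore_condition` option should be checked against the Watcher documentation for the version in use.

```python
import json


def build_execute_request(watch_id, ignore_condition=False):
    """Build (method, path, body) for executing a stored watch by ID.

    The body holds execution options only. It must not contain the watch
    definition fields such as "trigger", "input", or "actions".
    """
    path = "/_watcher/watch/{}/_execute".format(watch_id)
    body = {}
    if ignore_condition:
        # Run the actions even if the condition is not met, which is
        # handy for testing the Slack action while the cluster is healthy.
        body["ignore_condition"] = True
    return "POST", path, json.dumps(body)
```

For example, `build_execute_request("cluster_health_watch", ignore_condition=True)` yields a `POST` to `/_watcher/watch/cluster_health_watch/_execute` with the body `{"ignore_condition": true}`.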

(Alexander Reelsen) #8

Without the proxy, this looks like a DNS issue.
With the proxy in use, it looks like a firewall issue.

You should use something like tcpdump to debug this further, as this looks very much like a network issue and not a Watcher issue.
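
Before reaching for tcpdump, the two suspected failure modes can be checked separately from the Elasticsearch node. The following is a minimal Python sketch, not a replacement for a packet capture. The helper names `can_resolve` and `can_connect` are hypothetical, and the host and port passed in should be whatever is configured under `watcher.http.proxy` on the node.

```python
import socket


def can_resolve(host):
    """Return True if the local resolver can look up `host`."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, DNS failures,
        # and timeouts (all are OSError subclasses).
        return False


# Example checks, to be run on the Elasticsearch node:
#   can_resolve("hooks.slack.com")    # the DNS question (no proxy)
#   can_connect("<proxy-host>", 80)   # is the proxy reachable at all?
```

If `can_resolve` fails for `hooks.slack.com`, that matches the `UnknownHostException` seen without the proxy. If `can_connect` to the proxy fails or hangs, a firewall between the node and the proxy is a plausible cause.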

