Hipchat Integration Errors

alerting

(Matthew) #1

Hey there,

I have recently been setting up Watches for my company. Now that my bELK stack is in production, I would like to split the Apache errors we receive into two distinct rooms: one for dev errors and one for prod. When I scripted the change using Puppet, things did not work properly. Below are the configurations and the error I am receiving:

elasticsearch.yml:

path.data: /usr/share/elasticsearch/data/es-01
path.logs: /var/log/elasticsearch/es-01
network.host: 0.0.0.0
http.port: 9200
watcher.actions.hipchat.service:
   account:
     notify-dev-monitoring:
       profile: integration
       auth_token: 0vJKWV96XQVCiOp2GhkHhWALxmxF9BHNfnqObBTW
       room: testmon

Hipchat Settings:

Token: 0vJKWV96XQVCiOp2GhkHhWALxmxF9BHNfnqObBTW
Label: test
Scopes: Send Notification

curl:

curl -XPUT 'localhost:9200/_watcher/watch/api-error' -d '{
  "trigger" : {
    "schedule" : { "interval" : "10s" }
  },
  "input" : {
    "search" : {
      "request" : {
        "indices" : [ "filebeat" ],
        "body" : {
          "query" : {
            "filtered" : {
              "query" : {
                "match_phrase" : { "source": "/var/www/html/logs/error_log" }
              },
              "filter" : {
                "bool": {
                  "must": [
                    {
                      "range": {
                        "@timestamp" : {
                          "from" : "now-5m",
                          "to" : "now"
                        }
                      }
                    }
                  ]
                }
              }
            }
          }
        }
      }
    }
  },
  "actions" : {
    "notify-hipchat" : {
      "throttle_period" : "5m",
      "hipchat" : {
        "account" : "notify-dev-monitoring",
        "message" : {
          "body": "New error seen in api apache error_log!\n\nHost:   {{ctx.payload.hits.hits.0._source.beat.hostname}}\nMessage:   {{ctx.payload.hits.hits.0._source.message}}",
          "format" : "text",
          "color" : "red",
          "notify" : true
        }
      }
    }
  }
}'

Error received:

{"error":{"root_cause":[{"type":"parse_exception","reason":"could not parse [hipchat] action [api-error/null]. unknown hipchat account [notify-dev-monitoring]"}],"type":"parse_exception","reason":"could not parse [hipchat] action [api-error/null]. unknown hipchat account [notify-dev-monitoring]"},"status":400}

Given the 400 status, I realize the request itself is being rejected. The odd part is that if I keep elasticsearch.yml the same as above but change the account in the curl request back to the old account name, it works and keeps sending to the existing room. I restart the service after changing the config, so I am not sure why the old config appears to be cached and the new one is not recognized.

Watcher version: 2.3.5


(Steve Kearns) #2

Watcher executes on the currently elected master node, so when you make changes to elasticsearch.yml, you will effectively need a full cluster restart (or at least a complete rolling restart of all master-eligible nodes).

Restarting a single node in a multi-node cluster isn't likely to do the trick!
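
To see which nodes the restart advice above applies to, the _cat API can report the elected master and the master-eligible nodes. This is only a minimal sketch: it assumes the node answers on localhost:9200 as in the watch request, and `ES_URL` and the function names are illustrative helpers, not part of Watcher.

```shell
#!/bin/sh
# Assumption: the node answers on localhost:9200; override ES_URL otherwise.
ES_URL="${ES_URL:-localhost:9200}"

# Fetch one line per node: master marker first, then the node name.
# Marker values: "*" = elected master, "m" = master-eligible, "-" = not eligible.
list_nodes() {
  curl -s "$ES_URL/_cat/nodes?h=master,name"
}

# Read list_nodes output on stdin and print the name of the elected master.
# The marker is stripped from the line so node names with spaces survive.
elected_master() {
  awk '$1 == "*" { sub(/^[ \t]*\*[ \t]+/, ""); sub(/[ \t]+$/, ""); print }'
}

# Read list_nodes output on stdin and count the master-eligible nodes,
# the elected master included. Each of these would need a restart.
count_master_eligible() {
  awk '$1 == "*" || $1 == "m" { n++ } END { print n + 0 }'
}
```

Running `list_nodes | elected_master` and `list_nodes | count_master_eligible` against the cluster shows where Watcher is executing and how many nodes must pick up the new elasticsearch.yml.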


(Matthew) #3

Steve,

Thanks for the response! Unfortunately, I only have a single-node cluster set up, which is why I am at a loss!


(Matthew) #4

It seems this was due to the cluster status being red. I deleted all indices and things started working properly. Obviously this is not a real fix, as we would like to keep our data. I am looking for suggestions on how to approach this problem differently. The setup is one cluster with one node.
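
A red status means at least one primary shard is unassigned, so it may be possible to find the specific red index instead of deleting everything. The sketch below is one way to look, not a confirmed fix for this case: it assumes the node answers on localhost:9200, and `ES_URL` and the function names are illustrative.

```shell
#!/bin/sh
# Assumption: the node answers on localhost:9200; override ES_URL otherwise.
ES_URL="${ES_URL:-localhost:9200}"

# Overall status (green, yellow, or red) plus unassigned shard counts.
cluster_health() {
  curl -s "$ES_URL/_cluster/health?pretty"
}

# One line per index: health first, then the index name.
index_health() {
  curl -s "$ES_URL/_cat/indices?h=health,index"
}

# Read index_health output on stdin and print only the red indices.
red_indices() {
  awk '$1 == "red" { print $2 }'
}
```

`cluster_health` gives the overall picture, and `index_health | red_indices` narrows it to the indices that are actually red, which are the candidates to repair or remove.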

