Watch keeps firing when created via the Watcher API

Hi,

I've got several watches that monitor Kafka metrics. Now something strange is happening: when I create a watch via the API, it keeps firing. When I manually save the watch, the status becomes OK. I tried copying the saved watch into the JSON I'm inserting with the watch API, but I cannot see any difference. We are using Elastic Cloud on version 8.1.0.

The JSON of the watch:

{
  "trigger": {
    "schedule": {
      "interval": "5s"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": [
          ".ds-metrics-kafka.consumergroup-*"
        ],
        "rest_total_hits_as_int": true,
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "filter": [
                {
                  "term": {
                    "kafka.consumergroup.id": "admin-application-user"
                  }
                },
                {
                  "range": {
                    "@timestamp": {
                      "gte": "now-5m"
                    }
                  }
                }
              ]
            }
          },
          "aggs": {
            "consumers": {
              "terms": {
                "field": "kafka.consumergroup.id"
              },
              "aggs": {
                "lag": {
                  "avg": {
                    "field": "kafka.consumergroup.consumer_lag"
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "script": {
      "source": "if (ctx.payload.hits.total == 0) { return true; } else { if(ctx.payload.aggregations.consumers.buckets[0].lag.value > params.lag_threshold) { return true; } return false; }",
      "lang": "painless",
      "params": {
        "lag_threshold": 5
      }
    }
  },
  "actions": {
    "notify-pagerduty": {
      "throttle_period_in_millis": 172800000,
      "pagerduty": {
        "description": "[Error] Found {{ ctx.payload.hits.total }} docs. The admin-application-user consumer is down or exceeds the lag threshold of 5. See the payload for details",
        "attach_payload": true,
        "account": "kafka"
      }
    }
  }
}
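To make clear when this watch is expected to fire, here is the condition script restated as a small Python sketch. It is only an illustration of the Painless logic above, not something Watcher runs; the function name `should_fire` and the sample payloads are made up for the example.

```python
def should_fire(payload, lag_threshold=5):
    """Illustrative Python rendering of the watch's Painless condition."""
    # No documents in the last 5 minutes: the consumer is treated as down.
    if payload["hits"]["total"] == 0:
        return True
    # Otherwise compare the average lag of the first terms bucket to the threshold.
    lag = payload["aggregations"]["consumers"]["buckets"][0]["lag"]["value"]
    return lag > lag_threshold


# Hypothetical payloads shaped like the search input's response.
no_docs = {"hits": {"total": 0}, "aggregations": {"consumers": {"buckets": []}}}
healthy = {
    "hits": {"total": 12},
    "aggregations": {"consumers": {"buckets": [
        {"key": "admin-application-user", "lag": {"value": 2.0}}
    ]}},
}
lagging = {
    "hits": {"total": 12},
    "aggregations": {"consumers": {"buckets": [
        {"key": "admin-application-user", "lag": {"value": 40.0}}
    ]}},
}

for name, payload in [("no_docs", no_docs), ("healthy", healthy), ("lagging", lagging)]:
    print(name, should_fire(payload))
```

So the condition is met either when no documents arrived in the window or when the average lag is above 5.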

Any suggestions on how to tackle this problem?

Gerard

When you put a watch via the API, its status is always reset. If that is behavior you are missing, feel free to create an issue in the Elasticsearch repository, along with an example and your reason.
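One way to see the effect of the reset is to compare the status returned by `GET _watcher/watch/<watch_id>` right after the API put and again after saving in the UI. A minimal sketch, assuming the response carries per-action acknowledgement info under `status.actions.<name>.ack.state` as described in the Get watch API docs; the helper name `ack_states` and the sample response are hypothetical.

```python
def ack_states(get_watch_response):
    """Map each action name to its ack state from a Get watch API response."""
    status = get_watch_response.get("status", {})
    actions = status.get("actions", {})
    # Missing "ack" sections yield None instead of raising.
    return {name: info.get("ack", {}).get("state") for name, info in actions.items()}


# Made-up response, shaped like what the Get watch API is assumed to return
# shortly after the watch has been put.
sample_response = {
    "found": True,
    "_id": "kafka-consumer-lag",
    "status": {
        "state": {"active": True},
        "actions": {
            "notify-pagerduty": {
                "ack": {"state": "awaits_successful_execution"}
            }
        },
    },
}

print(ack_states(sample_response))
```

Comparing this output for the API-created watch and the manually saved one should show whether the two differ only in status rather than in the watch definition.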

Thanks!
