Transform or Other Solution to set null value to 0 on first run


#1

In the example below, the yellow state count is ({{ctx.payload.first.hits.total}}) and the red state count is ({{ctx.payload.second.hits.total}}). The first time a yellow status triggers the alert, there is nothing in the index, so the alert looks like this:

Cluster has been in a yellow state for () minutes and a red state for () minutes over the past hour. Current status is yellow.

How would I do a transform or something else so the first time it triggers it would look like this:

Cluster has been in a yellow state for (0) minutes and a red state for (0) minutes over the past hour. Current status is yellow.

Then, on the next run through, if the cluster is still in a yellow or red state, it would just use the hit counts from ctx.payload.first and ctx.payload.second.

```
{
  "trigger": {
    "schedule": {
      "interval": "1m"
    }
  },
  "input": {
    "chain": {
      "inputs": [
        {
          "first": {
            "search": {
              "request": {
                "search_type": "query_then_fetch",
                "indices": [
                  "watch_cluster_health"
                ],
                "types": [],
                "body": {
                  "query": {
                    "bool": {
                      "must": [
                        {
                          "match": {
                            "cluster_state": "yellow"
                          }
                        },
                        {
                          "range": {
                            "Time": {
                              "gte": "now-1h"
                            }
                          }
                        }
                      ]
                    }
                  }
                }
              }
            }
          }
        },
        {
          "second": {
            "search": {
              "request": {
                "search_type": "query_then_fetch",
                "indices": [
                  "watch_cluster_health"
                ],
                "types": [],
                "body": {
                  "query": {
                    "bool": {
                      "must": [
                        {
                          "match": {
                            "cluster_state": "red"
                          }
                        },
                        {
                          "range": {
                            "Time": {
                              "gte": "now-1h"
                            }
                          }
                        }
                      ]
                    }
                  }
                }
              }
            }
          }
        },
        {
          "third": {
            "http": {
              "request": {
                "scheme": "http",
                "host": "localhost",
                "port": 9200,
                "method": "get",
                "path": "/_cluster/health",
                "params": {},
                "headers": {}
              }
            }
          }
        }
      ]
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.third.status": {
        "not_eq": "green"
      }
    }
  },
  "actions": {
    "notify-slack1": {
      "slack": {
        "message": {
          "to": [
            "slack_channel"
          ],
          "text": "Cluster has been in a yellow state for ({{ctx.payload.first.hits.total}}) minutes and a red state for ({{ctx.payload.second.hits.total}}) minutes over the past hour. Current status is {{ctx.payload.third.status}}."
        }
      }
    },
    "index_payload": {
      "transform": {
        "script": {
          "source": "return [ 'Time': ctx.execution_time, 'cluster_state' : ctx.payload.third.status ]",
          "lang": "painless"
        }
      },
      "index": {
        "index": "watch_cluster_health",
        "doc_type": "_doc"
      }
    }
  }
}```
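
One way to do this (a sketch, untested; `yellow_minutes` and `red_minutes` are made-up field names) is a watch-level `transform` placed alongside `condition` and `actions`, using Painless's null-safe (`?.`) and elvis (`?:`) operators so that a missing hit total becomes 0:

```
"transform": {
  "script": {
    "lang": "painless",
    "source": "ctx.payload.yellow_minutes = ctx.payload.first?.hits?.total ?: 0; ctx.payload.red_minutes = ctx.payload.second?.hits?.total ?: 0; return ctx.payload;"
  }
}
```

The Slack text would then reference `{{ctx.payload.yellow_minutes}}` and `{{ctx.payload.red_minutes}}` instead of the raw hit totals.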

(Alexander Reelsen) #2

Hey,

Can you maybe ignore all the technical details (like the watch itself) and just explain what you are trying to achieve here in general? I am not fully sure how useful this alerting is, and would like to understand the full background first before giving any further advice.

I think any red state should be alerted on immediately if this is a production cluster. It also might make sense to include the indices that have red or yellow states.

Also, how do you populate the watch_cluster_health index? What do the documents look like?

--Alex


#3

Sure. I just want to show how many minutes the cluster has been in a non-green state over the past hour when the alert triggers. We use a basic health status alert now, and it triggers a one-minute yellow alert every so often. So instead of digging in when I get the alert, the first one would read 0 minutes yellow, 0 minutes red, current status yellow, and we would know it was just one of those one-minute yellow statuses. If the minutes in the past hour start piling up, we know to go investigate.


#4

watch_cluster_health is populated by the index action of the watch, which records the cluster state each time the alert fires:

```
      "transform": {
        "script": {
          "source": "return [ 'Time': ctx.execution_time, 'cluster_state' : ctx.payload.third.status ]",
          "lang": "painless"
        }
      },
      "index": {
        "index": "watch_cluster_health",
        "doc_type": "_doc"
      }
    }```

#5

An example document from the watch_cluster_health index:

```
Time           June 22nd 2018, 12:29:49.512
_id            O5FUKGQBx5mdMHKElqNx
_index         watch_cluster_health
_score         -
_type          _doc
cluster_state  yellow
```


(Alexander Reelsen) #6

There is zero guarantee of how many minutes this actually lasted; it just reflects when the monitoring ran, I suppose?

However, what about using a terms aggregation to count the number of red and yellow occurrences? That way you can see that you have indexed x documents with a yellow state and y documents with a red state.

Would that give you the information you need?
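
A sketch of that suggestion (untested; it assumes `cluster_state` was dynamically mapped as text with a `.keyword` sub-field, so adjust the field name to the actual mapping). A single search body could replace the two chained searches:

```
{
  "size": 0,
  "query": {
    "range": {
      "Time": {
        "gte": "now-1h"
      }
    }
  },
  "aggs": {
    "states": {
      "terms": {
        "field": "cluster_state.keyword"
      }
    }
  }
}
```

The counts would come back under that input's `aggregations.states.buckets` in the payload. Note that a state with no documents produces no bucket at all, so the first-run null problem would still need handling.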


#7

The alert checks every minute, so it would be pretty close to recording those values. I think I would run into the same problem with an aggregation, though. The alert above pretty much works; I just wanted it to look nicer on the first alert for a yellow or red cluster.

Because the index is empty on the first alert, the values are null, so I just want some way to replace the null values in the first alert with zeros. Right now it looks like () yellow () red current status yellow. The second time the alert fires in a yellow status within the hour, it looks like (1) yellow () red current status yellow. I want the first one to look like (0) yellow (0) red current status yellow, so basically just swapping the null for a 0. I can't figure out how to do this, but I'm sure there is a way with Painless, a transform, or something else I'm not thinking of or not aware of.


(system) #8

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.