Transform or Other Solution to set null value to 0 on first run


#1

In the example below, the yellow state count is ({{ctx.payload.first.hits.total}}) and the red state count is ({{ctx.payload.second.hits.total}}). The first time a yellow status triggers the alert, there is nothing in the index, so the alert looks like this:

Cluster has been in a yellow state for () minutes and a red state for () minutes over the past hour. Current status is yellow.

How would I do a transform or something else so the first time it triggers it would look like this:

Cluster has been in a yellow state for (0) minutes and a red state for (0) minutes over the past hour. Current status is yellow.

Then, on the next run through, if the cluster is still in a yellow or red state, it would just use the hit counts from ctx.payload.first and ctx.payload.second.

```
{
  "trigger": {
    "schedule": {
      "interval": "1m"
    }
  },
  "input": {
    "chain": {
      "inputs": [
        {
          "first": {
            "search": {
              "request": {
                "search_type": "query_then_fetch",
                "indices": [
                  "watch_cluster_health"
                ],
                "types": [],
                "body": {
                  "query": {
                    "bool": {
                      "must": [
                        {
                          "match": {
                            "cluster_state": "yellow"
                          }
                        },
                        {
                          "range": {
                            "Time": {
                              "gte": "now-1h"
                            }
                          }
                        }
                      ]
                    }
                  }
                }
              }
            }
          }
        },
        {
          "second": {
            "search": {
              "request": {
                "search_type": "query_then_fetch",
                "indices": [
                  "watch_cluster_health"
                ],
                "types": [],
                "body": {
                  "query": {
                    "bool": {
                      "must": [
                        {
                          "match": {
                            "cluster_state": "red"
                          }
                        },
                        {
                          "range": {
                            "Time": {
                              "gte": "now-1h"
                            }
                          }
                        }
                      ]
                    }
                  }
                }
              }
            }
          }
        },
        {
          "third": {
            "http": {
              "request": {
                "scheme": "http",
                "host": "localhost",
                "port": 9200,
                "method": "get",
                "path": "/_cluster/health",
                "params": {},
                "headers": {}
              }
            }
          }
        }
      ]
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.third.status": {
        "not_eq": "green"
      }
    }
  },
  "actions": {
    "notify-slack1": {
      "slack": {
        "message": {
          "to": [
            "slack_channel"
          ],
          "text": "Cluster has been in a yellow state for ({{ctx.payload.first.hits.total}}) minutes and a red state for ({{ctx.payload.second.hits.total}}) minutes over the past hour. Current status is {{ctx.payload.third.status}}."
        }
      }
    },
    "index_payload": {
      "transform": {
        "script": {
          "source": "return [ 'Time': ctx.execution_time, 'cluster_state' : ctx.payload.third.status ]",
          "lang": "painless"
        }
      },
      "index": {
        "index": "watch_cluster_health",
        "doc_type": "_doc"
      }
    }
  }
}```
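
One way to do this (a sketch, untested; `yellow_minutes` and `red_minutes` are made-up field names) is a watch-level `transform` placed alongside `condition` and `actions`, using Painless's null-safe (`?.`) and elvis (`?:`) operators so that a missing hit total becomes 0:

```
"transform": {
  "script": {
    "lang": "painless",
    "source": "ctx.payload.yellow_minutes = ctx.payload.first?.hits?.total ?: 0; ctx.payload.red_minutes = ctx.payload.second?.hits?.total ?: 0; return ctx.payload;"
  }
}
```

The Slack text would then reference `{{ctx.payload.yellow_minutes}}` and `{{ctx.payload.red_minutes}}` instead of the raw hit totals.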

(Alexander Reelsen) #2

Hey,

Can you maybe ignore all the technical details (like the watch itself) and just explain what you are trying to achieve here in general? I am not fully sure how useful this alerting is, and would like to understand the full background first before giving any further advice.

I think any red state should be alerted on immediately if this is a production cluster. It also might make sense to include the indices that have red or yellow states.

Also, how do you populate the watch_cluster_health index? What do the documents look like?

--Alex


#3

Sure. I just want to show how many minutes the cluster has been in a non-green state over the past hour when the alert triggers. We use a basic health status alert now, and it triggers a one-minute yellow alert every so often. So instead of digging in when I get the alert, the first one would read 0 minutes yellow, 0 minutes red, current status yellow, and we would know it was just one of those one-minute yellow statuses. If the minutes in the past hour start piling up, we know to go investigate.


#4

watch_cluster_health is populated by the index action of the watch, which records the cluster state each time the alert fires:

```
      "transform": {
        "script": {
          "source": "return [ 'Time': ctx.execution_time, 'cluster_state' : ctx.payload.third.status ]",
          "lang": "painless"
        }
      },
      "index": {
        "index": "watch_cluster_health",
        "doc_type": "_doc"
      }
    }```

#5

An example document from the watch_cluster_health index:

```
Time           June 22nd 2018, 12:29:49.512
_id            O5FUKGQBx5mdMHKElqNx
_index         watch_cluster_health
_score         -
_type          _doc
cluster_state  yellow
```


(Alexander Reelsen) #6

There is zero guarantee of how many minutes this actually lasted; it just reflects when the monitoring ran, I suppose?

However, what about using a terms aggregation to count the number of red and yellow occurrences? That way you can see that you have indexed x documents with a yellow state and y documents with a red state.

Would that give you the information you need?
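
A sketch of that suggestion (untested; it assumes `cluster_state` was dynamically mapped as text with a `.keyword` sub-field, so adjust the field name to the actual mapping). A single search body could replace the two chained searches:

```
{
  "size": 0,
  "query": {
    "range": {
      "Time": {
        "gte": "now-1h"
      }
    }
  },
  "aggs": {
    "states": {
      "terms": {
        "field": "cluster_state.keyword"
      }
    }
  }
}
```

The counts would come back under that input's `aggregations.states.buckets` in the payload. Note that a state with no documents produces no bucket at all, so the first-run null problem would still need handling.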


#7

The alert checks every minute, so it would be pretty close to recording those values. I think I would run into the same problem with an aggregation, though. The alert above pretty much works; I just wanted it to look nicer on the first alert for a yellow or red cluster.

Because the index is empty on the first alert, the values are null, so I just want some way to replace the null values in the first alert with zeros. Right now it looks like () yellow () red current status yellow. The second time the alert fires in a yellow status within the hour, it looks like (1) yellow () red current status yellow. I want the first one to look like (0) yellow (0) red current status yellow, so basically just swapping the null for a 0. I can't figure out how to do this, but I'm sure there is a way with Painless, a transform, or something else I'm not thinking of or not aware of.


(system) #8

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.