Create WATCHER with dynamic search value

Kibana version: 7.4.1

Description:
We want to create a Watcher alert (Slack, PagerDuty, email, etc.) that fires when the same value of a field appears more than once within a specific time period. The value is not known in advance, so it cannot be hard-coded into the watch.

Example:
Given the documents below, all indexed within a 5-minute window, the end result should be 2 alerts for field_a:
1st alert: field_a == "location_1" (2 results in less than 5 minutes)
2nd alert: field_a == "location_3" (3 results in less than 5 minutes)

    _doc:1
    field_a == "location_1"

    _doc:2
    field_a == "location_3"

    _doc:3
    field_a == "location_1"

    _doc:4
    field_a == "location_6"

    _doc:5
    field_a == "location_3"

    _doc:6
    field_a == "location_3"

Hello @greg.melasecca

I think the following example should do the trick.

Demo data

PUT demodata/_doc/1
{
  "@timestamp": "2020-05-12T09:10:40.828Z",
  "field_a": "location_1"
}
PUT demodata/_doc/2
{
  "@timestamp": "2020-05-12T09:10:41.828Z",
  "field_a": "location_3"
}
PUT demodata/_doc/3
{
  "@timestamp": "2020-05-12T09:10:42.828Z",
  "field_a": "location_1"
}
PUT demodata/_doc/4
{
  "@timestamp": "2020-05-12T09:10:43.828Z",
  "field_a": "location_6"
}
PUT demodata/_doc/5
{
  "@timestamp": "2020-05-12T09:10:44.828Z",
  "field_a": "location_3"
}
PUT demodata/_doc/6
{
  "@timestamp": "2020-05-12T09:10:45.828Z",
  "field_a": "location_3"
}

Watch

You can use a terms aggregation and then a bucket_selector to keep only the terms with doc_count > 1.
It is still worth mentioning that terms aggregations are approximate, and that this can be heavy on global ordinals (as field_a is a keyword) if the cardinality of field_a is really high (e.g. 10,000+ or so).

The watch below uses a 1-day range (now-1d); you'll have to adjust the range filter to consider just the last 5 minutes.

POST _watcher/watch/_execute
{
  "watch": {
    "trigger": {
      "schedule": {
        "interval": "30m"
      }
    },
    "input": {
      "search": {
        "request": {
          "indices": [
            "demodata"
          ],
          "body": {
            "query": {
              "bool": {
                "filter": [
                  {
                    "range": {
                      "@timestamp": {
                        "from": "now-1d",
                        "to": "now"
                      }
                    }
                  }
                ]
              }
            },
            "size": 0,
            "aggs": {
              "field_a_groups": {
                "terms": {
                  "field": "field_a",
                  "size": 100
                },
                "aggs": {
                  "filter_groups": {
                    "bucket_selector": {
                      "buckets_path": {
                        "count": "_count"
                      },
                      "script": "params.count > 1"
                    }
                  }
                }
              }
            }
          }
        }
      }
    },
    "condition": {
      "script": {
        "source": "ctx.payload.aggregations.field_a_groups.buckets.size() > 0"
      }
    },
    "actions": {
      "log": {
        "logging": {
          "text": "We have the following values:\n{{#ctx.payload.aggregations.field_a_groups.buckets}}{{key}}({{doc_count}})\n{{/ctx.payload.aggregations.field_a_groups.buckets}}"
        }
      }
    }
  }
}
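
For the 5-minute window from the original question, the range clause of the watch above could be narrowed roughly as follows. This is only a sketch: it uses `gte`/`lte` instead of `from`/`to`, and it assumes the clause replaces the existing entry in the `filter` array unchanged otherwise.

```json
{
  "range": {
    "@timestamp": {
      "gte": "now-5m",
      "lte": "now"
    }
  }
}
```

With a 5-minute lookback, the 30m schedule interval would leave most of the time uncovered, so the interval would probably need to be lowered to 5m or less as well.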

Maybe transforms?

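Transform jobs can be used to build an entity-centric index. (The `_transform` endpoints are available from 7.5; on 7.4 the equivalent APIs live under `_data_frame/transforms`.)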

POST _transform/_preview
{
  "id": "demotransform",
  "source": {
    "index": [
      "demodata"
    ],
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "transformeddemodata"
  },
  "sync": {
    "time": {
      "field": "@timestamp",
      "delay": "15m"
    }
  },
  "pivot": {
    "group_by": {
      "time": {
        "date_histogram": {
          "field": "@timestamp",
          "fixed_interval": "5m"
        }
      },
      "field_a_groups": {
        "terms": {
          "field": "field_a"
        }
      }
    },
    "aggregations": {
      "count": {
        "value_count": {
          "field": "field_a"
        }
      }
    }
  }
}

Result:

{
  "preview" : [
    {
      "field_a_groups" : "location_1",
      "count" : 2.0,
      "time" : 1589274600000
    },
    {
      "field_a_groups" : "location_3",
      "count" : 3.0,
      "time" : 1589274600000
    },
    {
      "field_a_groups" : "location_6",
      "count" : 1.0,
      "time" : 1589274600000
    }
  ],
  "mappings" : {
    "properties" : {
      "field_a_groups" : {
        "type" : "keyword"
      },
      "count" : {
        "type" : "long"
      },
      "time" : {
        "type" : "date"
      }
    }
  }
}

Based on that, a watch could then query the transformed index instead of aggregating over the raw documents.
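
As a rough sketch of such a query: assuming the transform has actually been created and started (the preview alone does not write anything to `transformeddemodata`), a search like the one below would return only the 5-minute buckets in which a `field_a` value occurred more than once.

```
GET transformeddemodata/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "count": {
              "gt": 1
            }
          }
        }
      ]
    }
  }
}
```

With the demo data this should match the location_1 and location_3 rows and skip location_6. The same request body could serve as the search input of a watch, with a condition on the number of hits.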

Awesome, thank you Luca for the information. Let me run a couple of tests and I will update here.