Watcher Alert not working

Hello, I want to create a Watcher alert, and the code that I use is the following:

{
"trigger": {
"schedule": {
"interval": "1m"
}
},
"input": {
"search": {
"request": {
"body": {
"size": 0,
"query": {
"match_phrase": {
"host.hostname": {
"query": "sag-prd-cas-022.sag.services"
}
}
}
},
"indices": [
"apm-*"
]
}
}
},
"condition": {

        "ctx.payload.hits.hits.apm-*.transaction.duration.us": {
          "gte": 100000,
          "lt": 110000
        }

},
"actions": {
"send_email" : {
"email" : {
"to" : "<alexandros.ananikidis@sag-ag.ch",
"subject" : "Watcher Notification",
"body" : "Alex testing- Duration 10 to 11 seconds"
}
}
}
}

Nevertheless, I get the following error message:
[parse_exception] could not parse condition for watch [a22ee7aa-41da-4b73-a24a-69979db33bf1]. unknown condition type [ctx.payload.hits.hits.apm-*.transaction.duration.us]

The index pattern that I use is apm-*,
and the field that I want to check is the one shown in the image below:

What am I doing wrong?

Thank you

Hey,

Please take your time to properly format messages using markdown. This is pretty hard to read without any indentation. Thanks!

I think your approach for the condition here is wrong. If you are searching for transactions with a certain duration, then you should use a range query as part of the search request and simply check ctx.payload.hits.total (depending on your Elasticsearch version you might need to check another field).

Also your condition should either be a script condition or a compare condition.

--Alex
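
For illustration, a minimal sketch of the script form of a condition. It assumes that hits.total arrives in the payload as a plain integer and that any hit at all should fire the alert; neither is stated in the reply itself.

```json
{
  "condition": {
    "script": {
      "lang": "painless",
      "source": "return ctx.payload.hits.total > 0;"
    }
  }
}
```

The compare form expresses the same check declaratively: a "compare" object whose single key is the payload path and whose value is an operator/value pair such as "gt": 0.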

Hello Alex,

Thank you for all the info. How can I transform the input section in order to use a range query as part of the search request?

More specifically, how can I add the following filter to the input section:

"transaction.duration.us":
{
"gte": 10000,
"lt": 10300
}

Thank you

This is how I am trying to create the alert now, but it still fails. How can I change it to work correctly?

 {
      "trigger": {
        "schedule": {
          "interval": "1m"
        }
      },
      "input": {
        "search": {
          "request": {
            "body": {
              "size": 10,
              "query": {
                "match_phrase": {
                "host.hostname": {
                  "query": "sag-prd-cas-022.sag.services"
                },
               "apm-*.transaction.duration.us": {
                  "gte": 10000,
                  "lt": 10300
                }
               }
              }
            },
            "indices": [
              "apm-*"
            ]
          }
        }
      },
      "condition": {
         "compare": {
          "ctx.payload.hits.total": {
            "gte": 0
          }
        }
      },
      "actions": {
        "send_email" : { 
        "email" : { 
          "to" : "<alexandros.ananikidis@sag-ag.ch>", 
          "subject" : "Watcher Notification", 
          "body" : "Alex testing- Duration 10 to 11 seconds" 
        }
      }
      }
    }

The error message mentions the following:

{
  "watch_id": "1eb8afbc-c25c-4b2f-ad35-d32264cbcfad",
  "node": "pWmcfHISSE29wi2TaAyfOg",
  "state": "failed",
  "user": "ext_anan",
  "status": {
"state": {
  "active": true,
  "timestamp": "2019-11-06T08:59:55.555Z"
},
"actions": {
  "send_email": {
    "ack": {
      "timestamp": "2019-11-06T08:59:55.555Z",
      "state": "awaits_successful_execution"
    }
  }
},
"execution_state": "failed",
"version": -1
  },
  "trigger_event": {
"type": "schedule",
"triggered_time": "2019-11-06T09:23:56.034Z",
"schedule": {
  "scheduled_time": "2019-11-06T09:23:55.583Z"
}
  },
  "input": {
"search": {
  "request": {
    "search_type": "query_then_fetch",
    "indices": [
      "apm-*"
    ],
    "rest_total_hits_as_int": true,
    "body": {
      "size": 10,
      "query": {
        "match_phrase": {
          "host.hostname": {
            "query": "sag-prd-cas-022.sag.services"
          },
          "apm-*.transaction.duration.us": {
            "gte": 10000,
            "lt": 10300
          }
        }
      }
    }
  }
}
  },
  "condition": {
"compare": {
  "ctx.payload.hits.total": {
    "gte": 10
  }
}
  },
  "metadata": {
"name": "Alex testing",
"xpack": {
  "type": "json"
}
  },
  "result": {
"execution_time": "2019-11-06T09:23:56.034Z",
"execution_duration": 0,
"input": {
  "type": "search",
  "status": "failure",
  "error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "[match_phrase] query doesn't support multiple fields, found [host.hostname] and [apm-*.transaction.duration.us]",
        "line": 1,
        "col": 126
      }
    ],
    "type": "parsing_exception",
    "reason": "[match_phrase] query doesn't support multiple fields, found [host.hostname] and [apm-*.transaction.duration.us]",
    "line": 1,
    "col": 126
  },
  "search": {
    "request": {
      "search_type": "query_then_fetch",
      "indices": [
        "apm-*"
      ],
      "rest_total_hits_as_int": true,
      "body": {
        "size": 10,
        "query": {
          "match_phrase": {
            "host.hostname": {
              "query": "sag-prd-cas-022.sag.services"
            },
            "apm-*.transaction.duration.us": {
              "gte": 10000,
              "lt": 10300
            }
          }
        }
      }
    }
  }
},
"actions": []
  },
  "messages": [
"failed to execute watch input"
  ]
}

Thank you

You need to use a bool query and wrap both the match_phrase clause and the duration range filter in it.

Small hint: Don't start with the watch, but write a proper query first, that returns the data you need, only then try to integrate it into the watch.
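
A sketch of what such a standalone query might look like, as a request body for a search against apm-* (for example from the Kibana Dev Tools console). The field names and thresholds are copied from the posts above; "size": 1 is an assumption, chosen only so that one sample hit comes back for inspection.

```json
{
  "size": 1,
  "query": {
    "bool": {
      "filter": [
        {
          "match_phrase": {
            "host.hostname": "sag-prd-cas-022.sag.services"
          }
        },
        {
          "range": {
            "transaction.duration.us": {
              "gte": 10000,
              "lt": 10300
            }
          }
        }
      ]
    }
  }
}
```

Once this returns the expected documents, the same object can be placed under input.search.request.body in the watch.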

Hi,

I would probably change your naming convention from "." to "_", as my condition script below doesn't work with "." but does work with underscores.

Error you get when using "."

 "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "invalid sequence of tokens near ['.'].",
      "caused_by": {
        "type": "no_viable_alt_exception",
        "reason": null
      }

Once you have changed to underscores, you can try the watch below.

Otherwise you can keep the ".", but you will have to write a different condition.

{
  "trigger": {
    "schedule": {
      "interval": "1m"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": [
          "apm-*"
        ],
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "filter": [
                {
                  "range": {
                    "@timestamp": {
                      "gte": "now-{{ctx.metadata.window_period}}"
                    }
                  }
                },
                {
                  "term": {
                    "host_hostname": "sag-prd-cas-022.sag.services"
                  }
                },
                {
                  "range": {
                    "transaction_duration_us": {
                      "gte": 10000,
                      "lt": 10300
                    }
                  }
                }
              ]
            }
          },
          "aggs": {
            "transaction_name": {
              "terms": {
                "field": "transaction_name"
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "script": {
      "source": """
         def offenders = [];
           for (def transaction_name: ctx.payload.aggregations.transaction_name.buckets) {
                if (transaction_name.doc_count >= 1) {
                offenders.add([
                  'execution_time' : ctx.trigger.triggered_time
                ]);
              }
            }
      ctx.payload.offenders = offenders;
      return offenders.size() > 0;
""",
      "lang": "painless"
    }
  },
  "actions": {
    "send_email": {
      "email": {
        "to": "<alexandros.ananikidis@sag-ag.ch>",
        "subject": "Watcher Notification",
        "body": "Alex testing- Duration 10 to 11 seconds"
      }
    }
  },
  "metadata": {
    "window_period": "1m"
  },
  "throttle_period_in_millis": 120000
}

This should run every minute, search the past 1 minute of log data, find all transactions from that hostname with a duration between 10000 and 10300 (microseconds, going by the field name), group them by transaction name, and fire if any are found.
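
On the "." problem mentioned above: a dotted name such as transaction.name is not a valid Painless variable name, and a dotted aggregation name cannot be reached with plain dot access. One possible workaround, sketched here under the assumption that the aggregation is named "transaction.name", is to use a simple loop variable (bucket is a hypothetical name) and bracket access on the aggregations map:

```json
{
  "condition": {
    "script": {
      "lang": "painless",
      "source": "for (def bucket : ctx.payload.aggregations['transaction.name'].buckets) { if (bucket.doc_count >= 1) { return true; } } return false;"
    }
  }
}
```

This variant only decides whether the watch fires; it does not build the offenders list from the script above.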

Hello Alex and Jason,

Thank you both a lot for your help.
After your recommendations I tried the following simple script:

{
  "trigger": {
    "schedule": {
      "interval": "1m"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": [
          "apm-*"
        ],
        "rest_total_hits_as_int": true,
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "filter": [
                {
                  "term": {
                    "host_hostname": "sag-prd-cas-022.sag.services"
                  }
                },
                {
                  "range": {
                    "transaction_duration_us": {
                      "gte": 10000,
                      "lt": 10100
                    }
                  }
                }
              ]
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.hits.total": {
        "gte": 0
      }
    }
  },
  "actions": {
    "send_email": {
      "email": {
        "profile": "standard",
        "to": [
          "<alexandros.ananikidis@sag-ag.ch>"
        ],
        "subject": "Watcher Notification",
        "body": {
          "text": "Alex testing- Duration 10 to 11 seconds Watch [{{ctx.metadata.name}}] has exceeded {{ctx.payload.hits.hits.0.transaction.name}} the threshold The following transaction took longer than 10 seconds "
        }
      }
    }
  }
}

Then I executed it from the execute watch API, as was suggested, in order to get more logs about the trigger results. The results are the following:

{
  "_id" : "1eb8afbc-c25c-4b2f-ad35-d32264cbcfad_f47ac44d-b7ec-4c89-817f-868b6194d32b-2019-11-07T08:57:25.471974Z",
  "watch_record" : {
    "watch_id" : "1eb8afbc-c25c-4b2f-ad35-d32264cbcfad",
    "node" : "IQhDHLCVRXCAKsejaLyT-w",
    "state" : "executed",
    "user" : "ext_anan",
    "status" : {
      "state" : {
        "active" : true,
        "timestamp" : "2019-11-07T08:57:23.353Z"
      },
      "last_checked" : "2019-11-07T08:57:25.471Z",
      "last_met_condition" : "2019-11-07T08:57:25.471Z",
      "actions" : {
        "send_email" : {
          "ack" : {
            "timestamp" : "2019-11-07T08:57:23.353Z",
            "state" : "awaits_successful_execution"
          },
          "last_execution" : {
            "timestamp" : "2019-11-07T08:57:25.471Z",
            "successful" : false,
            "reason" : ""
          }
        }
      },
      "execution_state" : "executed",
      "version" : 71
    },
    "trigger_event" : {
      "type" : "manual",
      "triggered_time" : "2019-11-07T08:57:25.471Z",
      "manual" : {
        "schedule" : {
          "scheduled_time" : "2019-11-07T08:57:25.471Z"
        }
      }
    },
    "input" : {
      "search" : {
        "request" : {
          "search_type" : "query_then_fetch",
          "indices" : [
            "apm-*"
          ],
          "rest_total_hits_as_int" : true,
          "body" : {
            "size" : 0,
            "query" : {
              "bool" : {
                "filter" : [
                  {
                    "term" : {
                      "host_hostname" : "sag-prd-cas-022.sag.services"
                    }
                  },
                  {
                    "range" : {
                      "transaction_duration_us" : {
                        "gte" : 10000,
                        "lt" : 10100
                      }
                    }
                  }
                ]
              }
            }
          }
        }
      }
    },
    "condition" : {
      "compare" : {
        "ctx.payload.hits.total" : {
          "gte" : 0
        }
      }
    },
    "metadata" : {
      "name" : "Alextesting",
      "xpack" : {
        "type" : "json"
      }
    },
    "result" : {
      "execution_time" : "2019-11-07T08:57:25.471Z",
      "execution_duration" : 29,
      "input" : {
        "type" : "search",
        "status" : "success",
        "payload" : {
          "_shards" : {
            "total" : 105,
            "failed" : 0,
            "successful" : 105,
            "skipped" : 0
          },
          "hits" : {
            "hits" : [ ],
            "total" : 0,
            "max_score" : null
          },
          "took" : 20,
          "timed_out" : false
        },
        "search" : {
          "request" : {
            "search_type" : "query_then_fetch",
            "indices" : [
              "apm-*"
            ],
            "rest_total_hits_as_int" : true,
            "body" : {
              "size" : 0,
              "query" : {
                "bool" : {
                  "filter" : [
                    {
                      "term" : {
                        "host_hostname" : "sag-prd-cas-022.sag.services"
                      }
                    },
                    {
                      "range" : {
                        "transaction_duration_us" : {
                          "gte" : 10000,
                          "lt" : 10100
                        }
                      }
                    }
                  ]
                }
              }
            }
          }
        }
      },
      "condition" : {
        "type" : "compare",
        "status" : "success",
        "met" : true,
        "compare" : {
          "resolved_values" : {
            "ctx.payload.hits.total" : 0
          }
        }
      },
      "actions" : [
        {
          "id" : "send_email",
          "type" : "email",
          "status" : "failure",
          "error" : {
            "root_cause" : [
              {
                "type" : "general_script_exception",
                "reason" : "Error running com.github.mustachejava.codes.DefaultMustache@18275d08"
              }
            ],
            "type" : "general_script_exception",
            "reason" : "Error running com.github.mustachejava.codes.DefaultMustache@18275d08",
            "caused_by" : {
              "type" : "mustache_exception",
              "reason" : "Failed to get value for ctx.payload.hits.hits.0.transaction.name @[query-template:1]",
              "caused_by" : {
                "type" : "mustache_exception",
                "reason" : "0 @[query-template:1]",
                "caused_by" : {
                  "type" : "index_out_of_bounds_exception",
                  "reason" : "0"
                }
              }
            }
          }
        }
      ]
    },
    "messages" : [ ]
  }
}


As can be seen, I cannot access the transaction name correctly, and I want to include it in my email:

"type" : "mustache_exception",
              "reason" : "Failed to get value for ctx.payload.hits.hits.0.transaction.name @[query-template:1]"

How can I correct that?
My main question is also how I can see the structure of ctx.payload, because I believe that would make it easier to see how to access its fields.

Thank you a lot in advance, and sorry for the long post.

Best regards,
Alex

Your condition is wrong, as it also triggers when no hits are returned: you are using gte, i.e. greater than or equal. Either use gt as the operator or 1 as the value.
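
As a small sketch of the second option, keeping gte and raising the value to 1 (the payload path is the one already used in the watch above):

```json
{
  "condition": {
    "compare": {
      "ctx.payload.hits.total": {
        "gte": 1
      }
    }
  }
}
```

With this, the execution shown earlier, where hits.total resolved to 0, would no longer meet the condition, so the email action would not run on an empty result.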

Hello Alex ,

Thank you, I corrected that one to use only the gt operator. Nevertheless, I can't get the transaction name into my email message body. I use **ctx.payload.hits.hits.transaction.name** to get it, but it fails. Do you know how to write it correctly?

{
  "trigger": {
    "schedule": {
      "interval": "1m"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": [
          "apm-*"
        ],
        "rest_total_hits_as_int": true,
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "filter": [
                {
                  "term": {
                    "host_hostname": "sag-prd-cas-022.sag.services"
                  }
                },
                {
                  "range": {
                    "transaction_duration_us": {
                      "gte": 10000,
                      "lt": 11000
                    }
                  }
                }
              ]
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.hits.total": {
        "gt": 0
      }
    }
  },
  "actions": {
    "send_email": {
      "email": {
        "profile": "standard",
        "to": [
          "<alexandros.ananikidis@sag-ag.ch>"
        ],
        "subject": "Watcher Notification",
        "body": {
          "text": "Alex testing- Duration 10 to 11 seconds Watch [{{ctx.metadata.name}}] has exceeded {{ctx.payload.hits.hits.transaction.name}} the threshold The following transaction took longer than 10 seconds "
        }
      }
    }
  }
}

Here is the field name from the log itself.

Thank you once again
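
A possible reading of the problem, based on the execution record posted earlier: the payload is what appears under result.input.payload, and with "size": 0 its hits.hits array is empty, so there is nothing to read a name from. Each returned hit also keeps its document fields under _source. The sketch below assumes "size" in the search body is raised to at least 1 and that transaction.name is present in _source; it loops over the hits with a Mustache section instead of addressing hits.hits directly:

```json
{
  "body": {
    "text": "Watch [{{ctx.metadata.name}}] found slow transactions: {{#ctx.payload.hits.hits}}{{_source.transaction.name}} ({{_source.transaction.duration.us}} us) {{/ctx.payload.hits.hits}}"
  }
}
```

Running the watch through the execute watch API again and inspecting result.input.payload is a quick way to confirm the exact field paths before relying on them in the template.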

Hello Jason,

Thank you very much for your help. Just wanted to ask if the part

"range": {
"@timestamp": {
"gte": "now-{{ctx.metadata.window_period}}"
}
}

is necessary or not. If I don't include it, won't the alert, since it is triggered every minute, check only the previous 60 seconds by default?

Hi,

If you don't include it, then every minute you will be searching all of your data that matches the query. I think it is necessary for this reason.

Don't you want to be alerted only when 'new' logs match your query?

Jason
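
As an alternative sketch for the time window, the range can be anchored to the watch's scheduled time rather than to "now". This assumes the time field is @timestamp and that a one-minute window matching the trigger interval is wanted:

```json
{
  "range": {
    "@timestamp": {
      "gte": "{{ctx.trigger.scheduled_time}}||-1m",
      "lte": "{{ctx.trigger.scheduled_time}}"
    }
  }
}
```

A plain "gte": "now-1m" behaves much the same in practice; the scheduled-time form just makes consecutive windows line up with the schedule if an execution starts a little late.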

Hello Jason.

Of course you are right... The really awkward situation now is that when I use the following code, the field @timestamp is not recognised. How can that be?

  {
  "trigger": {
"schedule": {
  "interval": "1m"
}
  },
  "input": {
"search": {
  "request": {
    "search_type": "query_then_fetch",
    "indices": [
      "apm-*"
    ],
    "rest_total_hits_as_int": true,
    "body": {
      "size": 0,
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "@timestamp": {
                  "gte": "now-1m"
                  
                }
              },
              "term": {
                "host.hostname": "sag-prd-cas-022.sag.services"
              }
            },
            {
              "range": {
                "transaction.duration.us": {
                  "gt": 1000
                }
              }
            }
          ]
        }
      }
    }
  }
}
  },
  "condition": {
"compare": {
  "ctx.payload.hits.total": {
    "gt": 0
  }
}
  },
  "actions": {
"send_email": {
  "email": {
    "profile": "standard",
    "to": [
      "<alexandros.ananikidis@sag-ag.ch>"
    ],
    "subject": "Watcher Notification",
    "body": {
      "text": "Alex testing- Duration 10 to 11 seconds Watch [{{ctx.metadata.name}}] has exceeded {{ctx.payload.hits.hits}} the threshold The following transaction took longer than 10 seconds "
    }
  }
}
  }
}
The answer that I get is the following (I also tried timestamp instead of @timestamp, but still got the same error):

"root_cause": [
          {
            "type": "parsing_exception",
            "reason": "[range] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
            "line": 1,
            "col": 79 

Any ideas?
Thank you a lot again
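
One likely cause, going by the watch as posted: the first object inside "filter" holds both a "range" key and a "term" key, and the parser expects that object to end after the range clause. The error is therefore probably about the structure rather than the @timestamp field name. A sketch of the same filter array with each clause in its own object:

```json
{
  "filter": [
    {
      "range": {
        "@timestamp": {
          "gte": "now-1m"
        }
      }
    },
    {
      "term": {
        "host.hostname": "sag-prd-cas-022.sag.services"
      }
    },
    {
      "range": {
        "transaction.duration.us": {
          "gt": 1000
        }
      }
    }
  ]
}
```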

When you create your index, what is your time filter field name?

Hello, it is @timestamp. I changed the code to the following form and finally it works:

{
  "trigger": {
"schedule": {
  "interval": "1m"
}
  },
  "input": {
"search": {
  "request": {
    "search_type": "query_then_fetch",
    "indices": [
      "apm-*"
    ],
    "rest_total_hits_as_int": true,
    "body": {
      "size": 0,
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "@timestamp": {
                  "gte": "now-1m"
                }
              }
            },
            {
              "term": {
                "host.hostname": "sag-prd-cas-022.sag.services"
              }
            },
            {
              "range": {
                "transaction.duration.us": {
                  "gte": 100000,
                  "lt": 103000
                }
              }
            }
          ]
        }
      }
    }
  }
}
  },
  "condition": {
"compare": {
  "ctx.payload.hits.total": {
    "gt": 0
  }
}
  },
  "actions": {
"send_email": {
  "email": {
    "profile": "standard",
    "to": [
      "<alexandros.ananikidis@sag-ag.ch>"
    ],
    "subject": "Watcher Notification",
    "body": {
      "text": "Alex testing- Duration 10 to 10.3 seconds  Watch [{{ctx.metadata.name}}] has exceeded {{ctx.payload.hits.hits.transaction.name}} the threshold The following transaction  {{ctx.payload.hits.hits.transaction.name}}  took longer than 10 seconds and less than 10.3 "
    }
  }
}
  }
}

My only problem now is that I can't include the transaction name in the message body.

I use this part of the code for that:

  "body": {
      "text": "Alex testing- Duration 10 to 10.3 seconds  Watch [{{ctx.metadata.name}}] has exceeded {{ctx.payload.hits.hits.transaction.name}} the threshold The following transaction  {{ctx.payload.hits.hits.transaction.name}}  took longer than 10 seconds and less than 10.3 "

But the {{ctx.payload.hits.hits.transaction.name}} field is not doing the job. How should I modify it?

Hi, maybe try this:

{
  "trigger": {
    "schedule": {
      "interval": "1m"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": [
         "apm-*"
        ],
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "filter": [
                {
                  "range": {
                    "@timestamp": {
                      "gte": "now-1m"
                    }
                  }
                },
                {
                  "term": {
                    "host.hostname": "sag-prd-cas-022.sag.services"
                  }
                },
                {
                  "range": {
                    "transaction.duration.us": {
                      "gte": 100000,
                      "lt": 103000
                    }
                  }
                }
              ]
            }
          },
          "aggs": {
            "transaction.name": {
              "terms": {
                "field": "transaction.name"
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "script": {
      "source": """
         def offenders = [];
          for (def transaction.name: ctx.payload.aggregations.transaction.name.buckets) {
              if (ctx.payload.hits.total > 0) {
                offenders.add([
                  'transaction.name': transaction.name.key
                ]);
              }
          }
          ctx.payload.offenders = offenders;
      return offenders.size() > 0;
""",
      "lang": "painless"
    }
  },
  "actions": {
    "send_email": {
      "email": {
        "profile": "standard",
        "to": [
          "<alexandros.ananikidis@sag-ag.ch>"
        ],
        "subject": "Watcher Notification",
        "body": "The following transactions {{#toJson}}ctx.payload.offenders{{/toJson}} have exceeded the threshold and took between 10 and 10.3 seconds to complete"
      }
    }
  }
}

Thank you for your help. When I try to create the alert with the exact code that you propose, I get the following syntax error:

Just realised it won't work for you, based on a previous discovery I made. This should:

PUT _watcher/watch/Name_Of_Watcher

{
  "trigger": {
    "schedule": {
      "interval": "1m"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": [
         "apm-*"
        ],
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "filter": [
                {
                  "range": {
                    "@timestamp": {
                      "gte": "now-1m"
                    }
                  }
                },
                {
                  "term": {
                    "host.hostname": "sag-prd-cas-022.sag.services"
                  }
                },
                {
                  "range": {
                    "transaction.duration.us": {
                      "gte": 100000,
                      "lt": 103000
                    }
                  }
                }
              ]
            }
          },
          "aggs": {
            "transaction.name": {
              "terms": {
                "field": "transaction.name"
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.hits.total": {
        "gt": 0
      }
    }
  },
  "actions": {
    "send_email": {
      "email": {
        "profile": "standard",
        "to": [
          "<alexandros.ananikidis@sag-ag.ch>"
        ],
        "subject": "Watcher Notification",
        "body": "The following transactions {{#toJson}}ctx.payload.aggregations.transaction.name{{/toJson}} have exceeded the threshold and took between 10 and 10.3 seconds to complete"
      }
    }
  }
}

Jason you are the best. With small modification it finally worked. Thank you very much !!!!!!!!!!!!