Machine learning alerts are failing with a Java error

machine-learning

(Dinesh Senthil Kumar) #1

Hi,

I am using the latest Elastic Stack (6.3) with the 30-day trial license to explore the machine learning features. I use Metricbeat to monitor my infrastructure and machine learning to identify potential bottlenecks and CPU anomalies. I have set up a watch to send emails when there is an anomaly, but the watch fails with the error below:

"[2018-07-05T10:01:31,373][ERROR][o.e.x.w.i.s.ExecutableSearchInput] [myinfra] failed to execute [search] input for watch [ml-cpu], reason [cannot write xcontent for unknown value of type class java.time.LocalDateTime]"


(Alexander Reelsen) #2

Hey,

Can you share the full watch as well as the output of the Execute Watch API?

Thanks!

--Alex


(Dinesh Senthil Kumar) #3

Hi, here is my watch definition. I have removed the email part to save space.

{
  "trigger": {
    "schedule": {
      "interval": "69s"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": [
          ".ml-anomalies-*"
        ],
        "types": [],
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "filter": [
                {
                  "term": {
                    "job_id": "cpuanlyzer"
                  }
                },
                {
                  "range": {
                    "timestamp": {
                      "gte": "now-20h"
                    }
                  }
                },
                {
                  "terms": {
                    "result_type": [
                      "bucket",
                      "record",
                      "influencer"
                    ]
                  }
                }
              ]
            }
          },
          "aggs": {
            "bucket_results": {
              "filter": {
                "range": {
                  "anomaly_score": {
                    "gte": 0
                  }
                }
              },
              "aggs": {
                "top_bucket_hits": {
                  "top_hits": {
                    "sort": [
                      {
                        "anomaly_score": {
                          "order": "desc"
                        }
                      }
                    ],
                    "_source": {
                      "includes": [
                        "job_id",
                        "result_type",
                        "timestamp",
                        "anomaly_score",
                        "is_interim"
                      ]
                    },
                    "size": 1,
                    "script_fields": {
                      "start": {
                        "script": {
                          "lang": "painless",
                          "inline": "LocalDateTime.ofEpochSecond((doc[\"timestamp\"].date.getMillis()-((doc[\"bucket_span\"].value * 1000)\n * params.padding)) / 1000, 0, ZoneOffset.UTC)",
                          "params": {
                            "padding": 10
                          }
                        }
                      },
                      "end": {
                        "script": {
                          "lang": "painless",
                          "inline": "LocalDateTime.ofEpochSecond((doc[\"timestamp\"].date.getMillis()+((doc[\"bucket_span\"].value * 1000)\n * params.padding)) / 1000, 0, ZoneOffset.UTC)",
                          "params": {
                            "padding": 10
                          }
                        }
                      },
                      "timestamp_epoch": {
                        "script": {
                          "lang": "painless",
                          "inline": "doc[\"timestamp\"].date.getMillis()/1000"
                        }
                      },
                      "timestamp_iso8601": {
                        "script": {
                          "lang": "painless",
                          "inline": "doc[\"timestamp\"].date"
                        }
                      },
                      "score": {
                        "script": {
                          "lang": "painless",
                          "inline": "Math.round(doc[\"anomaly_score\"].value)"
                        }
                      }
                    }
                  }
                }
              }
            },
            "influencer_results": {
              "filter": {
                "range": {
                  "influencer_score": {
                    "gte": 3
                  }
                }
              },
              "aggs": {
                "top_influencer_hits": {
                  "top_hits": {
                    "sort": [
                      {
                        "influencer_score": {
                          "order": "desc"
                        }
                      }
                    ],
                    "_source": {
                      "includes": [
                        "result_type",
                        "timestamp",
                        "influencer_field_name",
                        "influencer_field_value",
                        "influencer_score",
                        "isInterim"
                      ]
                    },
                    "size": 3,
                    "script_fields": {
                      "score": {
                        "script": {
                          "lang": "painless",
                          "inline": "Math.round(doc[\"influencer_score\"].value)"
                        }
                      }
                    }
                  }
                }
              }
            },
            "record_results": {
              "filter": {
                "range": {
                  "record_score": {
                    "gte": 3
                  }
                }
              },
              "aggs": {
                "top_record_hits": {
                  "top_hits": {
                    "sort": [
                      {
                        "record_score": {
                          "order": "desc"
                        }
                      }
                    ],
                    "_source": {
                      "includes": [
                        "result_type",
                        "timestamp",
                        "record_score",
                        "is_interim",
                        "function",
                        "field_name",
                        "by_field_value",
                        "over_field_value",
                        "partition_field_value"
                      ]
                    },
                    "size": 3,
                    "script_fields": {
                      "score": {
                        "script": {
                          "lang": "painless",
                          "inline": "Math.round(doc[\"record_score\"].value)"
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.aggregations.bucket_results.doc_count": {
        "gt": 0
      }
    }
  },
  "actions": {
    "log": {
      "logging": {
        "level": "info",
        "text": "Alert for job [{{ctx.payload.aggregations.bucket_results.top_bucket_hits.hits.hits.0._source.job_id}}] at [{{ctx.payload.aggregations.bucket_results.top_bucket_hits.hits.hits.0.fields.timestamp_iso8601.0}}] score [{{ctx.payload.aggregations.bucket_results.top_bucket_hits.hits.hits.0.fields.score.0}}]"
      }
    },

(Dinesh Senthil Kumar) #4
Execute API output:

{
"statusCode": 504,
"error": "Gateway Time-out",
"message": "Client request timeout"
}


(Alexander Reelsen) #5

OK, the Execute Watch API output would be great, but we can also look at the watch history instead.

Can you run the following query? It is probably best to put the result into a gist, as it will contain 10 executions, which hopefully include stack traces.

GET .watcher-history-*/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "watch_id": "ml-cpu"
          }
        }
      ]
    }
  },
  "sort": [
    {
      "trigger_event.triggered_time": {
        "order": "desc"
      }
    }
  ]
}

(Alexander Reelsen) #6

Hey,

Did you really paste the complete watch up there? I see a comma as the last character and would like to verify that. I am especially interested in whether you are using a transform somewhere.

Thanks!

--Alex


(Alexander Reelsen) #7

Hey,

Can you append .toString() to the start and end inline scripts in that watch and retry? I think that is the current issue.

--Alex


(Dinesh Senthil Kumar) #8

Hi Alex,

I tried something like this, but it throws an unexpected failure. Pardon my ignorance if the syntax is wrong.

"start": {
"script": {
"lang": "painless",
"inline": ".toString(LocalDateTime.ofEpochSecond((doc["timestamp"].date.getMillis()-((doc["bucket_span"].value * 1000)\n * params.padding)) / 1000, 0, ZoneOffset.UTC))",
"params": {
"padding": 10
}
}
},
"end": {
"script": {
"lang": "painless",
"inline": ".toString(LocalDateTime.ofEpochSecond((doc["timestamp"].date.getMillis()+((doc["bucket_span"].value * 1000)\n * params.padding)) / 1000, 0, ZoneOffset.UTC))",
"params": {
"padding": 10
}
}
},


(rich collier) #9

No, it would be:

              "start": {
                "script": {
                  "lang": "painless",
                  "source": """
LocalDateTime.ofEpochSecond((doc["timestamp"].date.getMillis()-((doc["bucket_span"].value * 1000) * params.padding)) / 1000, 0, ZoneOffset.UTC).toString()
""",
                  "params": {
                    "padding": 10
                  }
                }
              },
              "end": {
                "script": {
                  "lang": "painless",
                  "source": """
LocalDateTime.ofEpochSecond((doc["timestamp"].date.getMillis()+((doc["bucket_span"].value * 1000) * params.padding)) / 1000, 0, ZoneOffset.UTC).toString()
""",
                  "params": {
                    "padding": 10
                  }
                }
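
For anyone wondering what those two scripts compute and why the trailing .toString() matters, here is a minimal plain-Java sketch of the same arithmetic. It is not Painless and not Watcher code: the class and method names are made up for illustration, and the timestamp and 900-second bucket span are assumed example values. The point is only that the expression yields a LocalDateTime object, and calling .toString() turns it into an ISO-8601 string, which is the kind of value the search response can serialize.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Plain-Java sketch of the arithmetic inside the "start" and "end" script fields.
// All names here are hypothetical; only the java.time calls come from the watch.
class WatchWindowSketch {

    // Mirrors the Painless expression: shift the bucket timestamp (millis) by
    // bucket_span (seconds) * 1000 * padding, then build a UTC LocalDateTime.
    // sign is -1 for the window start and +1 for the window end.
    static LocalDateTime edge(long timestampMillis, long bucketSpanSeconds, int padding, int sign) {
        long shiftedMillis = timestampMillis + sign * (bucketSpanSeconds * 1000 * padding);
        return LocalDateTime.ofEpochSecond(shiftedMillis / 1000, 0, ZoneOffset.UTC);
    }

    // What the corrected "start" script returns: a String, not a LocalDateTime.
    static String start(long timestampMillis, long bucketSpanSeconds, int padding) {
        return edge(timestampMillis, bucketSpanSeconds, padding, -1).toString();
    }

    // What the corrected "end" script returns.
    static String end(long timestampMillis, long bucketSpanSeconds, int padding) {
        return edge(timestampMillis, bucketSpanSeconds, padding, +1).toString();
    }

    public static void main(String[] args) {
        long timestamp = 1530784800000L; // 2018-07-05T10:00:00Z, an example bucket time
        long bucketSpan = 900L;          // 15 minutes, an assumed bucket span
        int padding = 10;                // same value as params.padding in the watch

        System.out.println(start(timestamp, bucketSpan, padding)); // 2018-07-05T07:30
        System.out.println(end(timestamp, bucketSpan, padding));   // 2018-07-05T12:30
    }
}
```

With padding set to 10, the window reaches ten bucket spans on either side of the anomalous bucket, so a 15-minute bucket at 10:00 gives a window from 07:30 to 12:30 UTC.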

(Dinesh Senthil Kumar) #10

Thanks, Rich and Alex. The watch works after making the above code change.
Can this change be incorporated into the default watches too?


(rich collier) #11

Track the issue here: https://github.com/elastic/elasticsearch/issues/31853


(Mark Walkom) #12