Failed to execute watch transform (NoClassDefFoundError)

We are on ELK 2.2.0. I am trying to hunt a bug with watcher but I had no success in last days.

After a few successful executions the watcher keeps falling in following exception:

ScriptException[failed to run inline script [return ctx.payload.hits.hits.collect({ "${it.fields['@timestamp']}\t${it.fields.severity}\t${it.fields[
'source.syslog.host_from']}\t${it.fields.type}\t${it.fields.message}" }).join('\n')] using lang [groovy]]; nested: NoClassDefFoundError[sun/reflect/
MethodAccessorImpl]; nested: ClassNotFoundException[sun.reflect.MethodAccessorImpl];
        at org.elasticsearch.script.groovy.GroovyScriptEngineService$GroovyScript.run(GroovyScriptEngineService.java:318)
        at org.elasticsearch.watcher.transform.script.ExecutableScriptTransform.doExecute(ExecutableScriptTransform.java:73)
        at org.elasticsearch.watcher.transform.script.ExecutableScriptTransform.execute(ExecutableScriptTransform.java:59)
        at org.elasticsearch.watcher.transform.script.ExecutableScriptTransform.execute(ExecutableScriptTransform.java:40)
        at org.elasticsearch.watcher.execution.ExecutionService.executeInner(ExecutionService.java:378)
        at org.elasticsearch.watcher.execution.ExecutionService.execute(ExecutionService.java:273)
        at org.elasticsearch.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:438)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: sun/reflect/MethodAccessorImpl
        at sun.misc.Unsafe.defineClass(Native Method)
        at sun.reflect.ClassDefiner.defineClass(ClassDefiner.java:63)
        at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:399)
        at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:394)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.reflect.MethodAccessorGenerator.generate(MethodAccessorGenerator.java:393)

I cannot see any difference in the data the watcher collects, it rather seems to be something under the hood.

The watcher conditions are set like this:

{
  "metadata": { "tags" : ["ito", "dcs"] },
  "trigger" : { "schedule" : { "interval" : "60s" } },
  "input" : {
    "search" : {
      "timeout" : "20s",
      "request" : {
        "indices" : [ "<logstash-app.dcs-{now/d{YYYY-MM-dd}}>", "<logstash-app.dcs-{now/d-1d{YYYY-MM-dd}}>" ],
        "body" : {
          "size" : 50,
          "query" : {
            "filtered": {
              "filter": {
                "bool": {
                  "must": [
                    {
                      "range": {
                        "@timestamp": {
                          "gte": "{{ctx.trigger.scheduled_time}}||-1m",
                          "lte": "{{ctx.trigger.scheduled_time}}"
                        }
                      }
                    },
                    {
                      "term": {
                        "source.syslog.file_name": "ito"
                      }
                    }
                  ]
                }
              }
            }
          },
          "fields": ["@timestamp","severity","type","message", "source.syslog.host_from"],
          "sort": [
            {
              "@timestamp": {
                "order": "asc"
              }
            }
          ]
        }
      }
    }
  },
  "condition" : {
    "compare" : { "ctx.payload.hits.total" : { "gt" : 0 }}
  },
  "transform" : {
    "script" : "return ctx.payload.hits.hits.collect({ \"${it.fields.severity}\\t${it.fields['source.syslog.host_from']}\\t${it.fields.type}\\t${it.fields.message}\" }).join('\\n')"
  },
  "actions" : {
    "log_error" : {
      "logging" : {
        "category" : "watcher.actions.logging.ito",
        "text" : "Found ito items\n{{ctx.payload._value}}"
      }
    }
  }
}

I'll really appreciate any help.

Hey,

can you provide a fully fledged example here? It is hard to reproduce without having any actual documents. Also you might want to use the Execute API to actually see what data is being returned. I assume that by specifying fields in your search request, you do not have access to the _source anymore, as it is not part of the response.

I came up with this working example, but it might not be, what you are having

PUT test/test/1 
{
  "severity" : "TRACE",
  "type": "some_type",
  "message" : "Nice Message, duuuude!",
  "syslog" : {
    "host_from" : "test_host_from"
  }
}

PUT test/test/2
{
  "severity" : "DEBUG",
  "type": "another_type",
  "message" : "The Big Lebowski",
  "syslog" : {
    "host_from" : "another_host_from"
  }
}

PUT _watcher/watch/test
{
  "trigger" : { "schedule" : { "interval" : "10s" } },
  "input" : {
    "search" : {
      "request" : {
        "indices" : [ "test" ],
        "body" : {
          "query" : {
            "match_all": {}
          }
        }
      }
    }
  },
  "condition" : {
    "compare" : { "ctx.payload.hits.total" : { "gt" : 0 }}
  },
  "transform" : {
    "script" : "return ctx.payload.hits.hits.collect({ \"${it._source.severity}\\t${it._source.syslog.host_from}\\t${it._source.type}\\t${it._source.message}\" }).join('\\n')"
  },
  "actions" : {
    "log_error" : {
      "logging" : {
        "category" : "watcher.actions.logging.ito",
        "text" : "Found ito items\n{{ctx.payload._value}}"
      }
    }
  }
}

PUT _watcher/watch/test/_execute

Take a special look at the search result being returned as part of the last API call, that might help you debug further issues.

--Alex

Thanks for the _execute tip, after few successful calls it brought the error too. I will try your example shortly

Yes only fields, no _source. I will try to put back _source too.

Here is the result from the failed _exuecute call:
https://drive.google.com/open?id=0B1_N4nD6iVuzVTFRN0dWYjFlcUk

Actually I got watch executions from the .watch_history index too if it will be helpful:
2016-02-09, 09:41:47.381 watcher_failed.json
2016-02-09, 09:40:47.193 watcher_execution_not_needed.json
2016-02-09, 09:39:47.503 watcher_executed.json

Hey,

again: I think one of your core issues is using source/fields in combination. Try to stick to one. I would not use the fields notation, as mentioned in the elasticsearch documentation, and stick with source filtering, if you need to reduce the response data.

--Alex

removing fields did not help, furthermore and unfortunately even on your example it starts failing, after 3 success executions I got following

{
  "_id": "test_3-2016-02-09T12:02:19.421Z",
  "watch_record": {
    "watch_id": "test",
    "state": "failed",
    "trigger_event": {
      "type": "manual",
      "triggered_time": "2016-02-09T12:02:19.421Z",
      "manual": {
        "schedule": {
          "scheduled_time": "2016-02-09T12:02:19.421Z"
        }
      }
    },
    "input": {
      "search": {
        "request": {
          "search_type": "query_then_fetch",
          "indices": [
            "test"
          ],
          "types": [],
          "body": {
            "query": {
              "match_all": {}
            }
          }
        }
      }
    },
    "condition": {
      "compare": {
        "ctx.payload.hits.total": {
          "gt": 0
        }
      }
    },
    "messages": [
      "failed to execute watch transform"
    ],
    "result": {
      "execution_time": "2016-02-09T12:02:19.421Z",
      "execution_duration": 211,
      "input": {
        "type": "search",
        "status": "success",
        "payload": {
          "_shards": {
            "total": 5,
            "failed": 0,
            "successful": 5
          },
          "hits": {
            "hits": [
              {
                "_index": "test",
                "_type": "test",
                "_source": {
                  "severity": "DEBUG",
                  "syslog": {
                    "host_from": "another_host_from"
                  },
                  "type": "another_type",
                  "message": "The Big Lebowski"
                },
                "_id": "2",
                "_score": 1
              },
              {
                "_index": "test",
                "_type": "test",
                "_source": {
                  "severity": "TRACE",
                  "syslog": {
                    "host_from": "test_host_from"
                  },
                  "type": "some_type",
                  "message": "Nice Message, duuuude!"
                },
                "_id": "1",
                "_score": 1
              }
            ],
            "total": 2,
            "max_score": 1
          },
          "took": 209,
          "timed_out": false
        },
        "search": {
          "request": {
            "search_type": "query_then_fetch",
            "indices": [
              "test"
            ],
            "types": [],
            "template": {
              "template": {
                "query": {
                  "match_all": {}
                }
              },
              "params": {
                "ctx": {
                  "metadata": null,
                  "watch_id": "test",
                  "id": "test_3-2016-02-09T12:02:19.421Z",
                  "trigger": {
                    "triggered_time": "2016-02-09T12:02:19.421Z",
                    "scheduled_time": "2016-02-09T12:02:19.421Z"
                  },
                  "vars": {},
                  "execution_time": "2016-02-09T12:02:19.421Z"
                }
              }
            }
          }
        }
      },
      "condition": {
        "type": "compare",
        "status": "success",
        "met": true,
        "compare": {
          "resolved_values": {
            "ctx.payload.hits.total": 2
          }
        }
      },
      "transform": {
        "type": "script",
        "status": "failure",
        "reason": "ScriptException[failed to run inline script [return ctx.payload.hits.hits.collect({ \"${it._source.severity}\t${it._source.syslog.host_from}\t${it._source.type}\t${it._source.message}\" }).join('\n')] using lang [groovy]]; nested: NoClassDefFoundError[sun/reflect/MethodAccessorImpl]; nested: ClassNotFoundException[sun.reflect.MethodAccessorImpl]; "
      },
      "actions": []
    }
  }
}

Small note, we are runing on
java version "1.8.0_65"
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)

Hey,

I can reproduce this, thats a weird one. I'll try to dig into it...

--Alex

Hey,

turns out that this is a bug, we are investigating, feel free to track the issue in elasticsearch core

--Alex

Hi Alex,

thanks a lot! It was really fast to find the cause and fix. I am already looking forward the 2.2.1 release :).

Lukas

Hey,

in the meantime, you can rewrite your code, so it does not use closures, so everything should be fine.

--Alex