Machine Learning on scripted field?

Hello,

Yes, this is possible. The script field must be defined as part of the datafeed configuration (https://www.elastic.co/guide/en/x-pack/5.4/ml-put-datafeed.html).

To accomplish this:

  1. Create an advanced job (this technique only works with advanced jobs).
  2. Configure the job as you normally would. When creating detectors, instead of choosing an existing field, enter a name that we'll later assign to the script field. In this example, we are using the field name total_error_count, which doesn't exist in our documents.
  3. Once your job is configured as you like, go to the "Edit JSON" tab
  4. Append a new script_fields parameter inside the datafeed_config object. The syntax for script_fields is identical to that used by Elasticsearch. You can find more information on the syntax here. We'll add our total_error_count script field to the script_fields object. The script will do a simple addition of two fields in the document to produce a "total" error count:
  "datafeed_config": {
    "query": {
      "match_all": {}
    },
    "query_delay": "60s",
    "frequency": "150s",
    "scroll_size": 1000,
    "indexes": [
      "rally-2017"
    ],
    "types": [
      "metrics"
    ],
    "script_fields": {
      "total_error_count": {
        "script": {
          "lang": "painless",
          "inline": "doc['error_count'].value + doc['aborted_count'].value"
        }
      }
    }
  }

When you are done editing the JSON, you can verify the output of your script in the "Data Preview" tab. When satisfied, press Save.
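
If you would rather generate the JSON than edit it by hand, here is a minimal Python sketch of the same change. It only builds the datafeed_config fragment shown above and prints it; it does not talk to Elasticsearch or Kibana. The helper name add_script_field is hypothetical (it is not part of any Elastic client), and the field names are the ones from this example.

```python
import copy
import json


def add_script_field(datafeed_config, name, source, lang="painless"):
    """Return a copy of datafeed_config with one extra entry in script_fields.

    Hypothetical helper for illustration only; it just manipulates a dict.
    """
    updated = copy.deepcopy(datafeed_config)
    # Create the script_fields object if the datafeed does not have one yet.
    updated.setdefault("script_fields", {})[name] = {
        "script": {"lang": lang, "inline": source}
    }
    return updated


# The datafeed settings from the example, before the script field is added.
base_config = {
    "query": {"match_all": {}},
    "query_delay": "60s",
    "frequency": "150s",
    "scroll_size": 1000,
    "indexes": ["rally-2017"],
    "types": ["metrics"],
}

datafeed_config = add_script_field(
    base_config,
    "total_error_count",
    "doc['error_count'].value + doc['aborted_count'].value",
)

# Print the fragment so it can be compared with the JSON in the "Edit JSON" tab.
print(json.dumps({"datafeed_config": datafeed_config}, indent=2))
```

The helper works on a copy, so the original settings stay untouched if you want to compare the two versions.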

You'll notice that our detector references "total_error_count", which is generated at runtime by the script. Every time a document is loaded by Elasticsearch, the script is evaluated and its result is output as a "virtual" field, which the ML job then uses.
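
To make the "virtual" field idea concrete, here is a rough Python emulation of what the script does for each document. This is only an illustration with made-up sample values; Elasticsearch runs the Painless script itself, and nothing below reflects how it is implemented internally.

```python
def evaluate_script_field(doc):
    """Mimic doc['error_count'].value + doc['aborted_count'].value."""
    return doc["error_count"] + doc["aborted_count"]


# Made-up documents; only the two field names come from the example above.
sample_docs = [
    {"error_count": 3, "aborted_count": 1},
    {"error_count": 0, "aborted_count": 0},
    {"error_count": 7, "aborted_count": 2},
]

# The computed value is attached to a copy of each document, much like the
# script field appears in the datafeed output without changing what is stored.
enriched = [
    dict(doc, total_error_count=evaluate_script_field(doc)) for doc in sample_docs
]

for row in enriched:
    print(row)
```

The source documents are never modified; the extra field exists only in the rows handed on for analysis.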
