Elasticsearch machine learning datafeed configuration error

I'm trying to alter the datafeed for my ML jobs, but the same error keeps occurring.
What I want to do is transform a value like 'abc/def/ghi' into 'abc', so I wrote the following script:

  "runtime_mappings": {
    "my_runtime_field": {
      "type": "keyword",
      "script": {
        "source": "def m = /^\/(\w+|_)\//.matcher(doc['tokenstring3'].value); emit(m.find() ? m.group(1): '');"
      }
    }
  }

But every time I try to type any escape character, two sets of quotation marks are automatically added around my code, making it look like this.

      "script": {
        "source": """def m = /^\/(\w+|_)\//.matcher(doc['tokenstring3'].value); emit(m.find() ? m.group(1): ''"");

And when I delete the quotation marks, the screen goes black.

Does anyone know how to fix this error?

@zerojin63 I have been able to configure a runtime field similar to the one you are trying to use in your datafeed. The double quotes at the end of the second snippet you posted just need editing.

I am using the Advanced wizard on the Machine Learning page in Kibana, and then editing the JSON used in the datafeed to this, for the data view (index pattern) I am using:

"runtime_mappings": {
  "uri_path_first": {
    "type": "keyword",
    "script": {
      "source": """def m = /^\/(\w+|_)\//.matcher(doc['uri'].value); emit(m.find() ? m.group(1): '');"""
    }
  }
}
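If it helps, you can sanity-check a runtime field like this with a plain search request in Dev Tools before wiring it into the datafeed, since search requests also accept runtime_mappings. The index name (gallery-*) and source field (uri) below follow my example and will differ in your setup:

```json
GET gallery-*/_search
{
  "size": 3,
  "_source": false,
  "fields": ["uri_path_first"],
  "runtime_mappings": {
    "uri_path_first": {
      "type": "keyword",
      "script": {
        "source": """def m = /^\/(\w+|_)\//.matcher(doc['uri'].value); emit(m.find() ? m.group(1): '');"""
      }
    }
  }
}
```

If the field comes back populated in the `fields` section of the hits, the script itself is fine and any remaining problem is in the datafeed configuration.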

You can see in the datafeed preview that the runtime field is getting populated (most of my entries are empty, but there are non-empty ones too), and I can use the field in a job:

Note we have an issue open to switch to a new JSON editor throughout the Machine Learning UI, which would allow you to define the script using the same syntax you posted in your first snippet. Depending on which version of the stack you have, you can also define this field in the data view (index pattern) itself, using that simpler syntax, and it will be available to use in the anomaly detection job.

Hope this works for you.
Pete


The first one (datafeed configuration) still didn't work but the second one did!

This is the code I used.

if (doc["field_name.keyword"].size() != 0) {
    def path = doc["field_name.keyword"].value;
    if (path != null) {
        int firstSlashIndex = path.indexOf('/');
        if (firstSlashIndex > 0) {
            emit(path.substring(0, firstSlashIndex));
            return;
        }
    }
}
emit("");
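In case it is useful to others, a script like this can also be tried out ahead of time with a search-time runtime_mappings block in Dev Tools before saving it in the data view. The index pattern, the runtime field name, and "field_name.keyword" below are all placeholders to substitute with your own:

```json
GET my-index-*/_search
{
  "size": 3,
  "_source": false,
  "fields": ["path_first_segment"],
  "runtime_mappings": {
    "path_first_segment": {
      "type": "keyword",
      "script": {
        "source": """
          if (doc["field_name.keyword"].size() != 0) {
            def path = doc["field_name.keyword"].value;
            if (path != null) {
              int firstSlashIndex = path.indexOf('/');
              if (firstSlashIndex > 0) {
                emit(path.substring(0, firstSlashIndex));
                return;
              }
            }
          }
          emit("");
        """
      }
    }
  }
}
```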

Thank you so much for your help :+1:


I've run into another problem.

An error occurs during the job validation process when I use the runtime field I added as a detector field in the anomaly detection job.

This is the error that keeps occurring.

[screenshot of the error]

I've checked that the runtime field I added works fine in the 'Discover' section.
I've also checked that no error occurs if I use other fields (the original fields, not the one I added) as the detector.
So it must have something to do with the field I added, but I don't see what's wrong with it.
Do you happen to have any idea about this situation?

Apologies for the delay in replying, @zerojin63. The error you are seeing when trying to use the field in an ML job looks like it is caused by the datafeed preview step, which validates the job configuration, taking too long. The fact that the field shows up in Discover implies that the configuration of the field itself is fine.

What version of the stack do you have, and which of the anomaly detection job wizards are you using? If you use the advanced job wizard, add the detector using the runtime field, then hit 'Edit JSON' and take a look at the datafeed preview panel. Do you see values for your runtime field in there, like I see here:

You could also create the job directly in Kibana Dev Tools, and try running the datafeed preview from there. For example, with my config:

PUT _ml/anomaly_detectors/test1
{
  "analysis_config": {
    "bucket_span": "4h",
    "detectors": [
      {
        "function": "sum",
        "field_name": "bytes",
        "partition_field_name": "uri_first_path",
        "detector_description": "sum(bytes) partitionfield=uri_first_path"
      }
    ],
    "influencers": [
      "uri_first_path"
    ]
  },
  "data_description": {
    "time_field":"@timestamp"
  },
  "datafeed_config": {
    "datafeed_id": "datafeed-test1",
    "indices": ["gallery-*"],
    "runtime_mappings": {
      "uri_first_path": {
        "type": "keyword",
        "script": {
          "source": """def m = /^\/(\w+|_)\//.matcher(doc['uri'].value); emit(m.find() ? m.group(1): '');"""
        }
      }
    }
  }
}

GET _ml/datafeeds/datafeed-test1/_preview

Thanks for your reply.
Here are the answers to your questions.

  1. What version of the stack do you have?
    -> I'm using v7.17.3
  2. Which of the anomaly detection job wizards am I using?
    -> I used the advanced job wizard as well, and I was able to see values for my runtime field.

So I don't think there's anything wrong with my runtime field script, but the error still occurs during the job validation process.
I'm using 'rare' as my anomaly detection job detector. Could this be the reason for the error?

You shouldn't have any problems using the runtime field in a 'rare' detector. I have stepped through the advanced wizard and the validation step succeeded. My job and datafeed config and preview look like this (I have two runtime fields defined in my data view, but the job is only using one of them):
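For comparison, a minimal analysis_config with a 'rare' detector over a runtime field might look like the sketch below. Note that 'rare' takes its field as by_field_name rather than field_name; the field name here follows my earlier example and the bucket_span is arbitrary:

```json
"analysis_config": {
  "bucket_span": "15m",
  "detectors": [
    {
      "function": "rare",
      "by_field_name": "uri_first_path",
      "detector_description": "rare by uri_first_path"
    }
  ],
  "influencers": ["uri_first_path"]
}
```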

Are you able to share your job and datafeed config?

Did you try creating the job and running the datafeed preview inside Kibana Dev Tools as shown above?