Is it possible to choose the fields from the event before passing it for analysis?

Is it possible to limit the number of fields we process?
In Anomaly detection -> Configure datafeed, the Elasticsearch query is:

{
  "bool": {
    "must": [
      {
        "match_all": {}
      }
    ]
  }
}

I tried to limit the fields by adding a _source list, but it gave syntax errors:

"_source": [
    "field1",
    "field2",
    "field3"
  ],
{
  "bool": {
    "must": [
      {
        "match_all": {}
      }
    ]
  }
}

I am not even sure this is the right thing to do. The fields I am trying to remove have the same value in every record, so I figured it might be better to drop them altogether and keep only the ones that vary.

You don't need to do this - the datafeed does it for you.

The number of fields that the job has to process depends on the job config. Each field that's mentioned in the job config has to be retrieved, so the only way to reduce the number of fields the datafeed has to retrieve is to mention fewer fields in the job config.
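As a rough illustration, here is a minimal job config sketch. The names field1 and field2 are the placeholders from the question, and "timestamp" and the 15m bucket span are assumptions made up for this example. With a config like this, the datafeed would only need to retrieve field1, field2 and the time field, whatever else the documents contain.

```json
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      {
        "function": "mean",
        "field_name": "field1",
        "partition_field_name": "field2"
      }
    ],
    "influencers": ["field2"]
  },
  "data_description": {
    "time_field": "timestamp"
  }
}
```

field3 is not mentioned anywhere in the config, so the job has no reason to ask for it.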

For each field that the job requires, the datafeed decides whether to get it from doc values or from _source. It then requests the minimum set of doc values and _source fields needed to satisfy those requirements. If it decides to get everything from doc values, it switches off fetching of _source altogether.
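To give an idea of what that looks like, here is a sketch of the general shape of a search body that fetches two fields from doc values with _source switched off. It is not the exact request the datafeed sends, and the field names are again placeholders. Note that in a search request _source and docvalue_fields sit next to query at the top level, whereas the datafeed's query setting only takes the query part, which is most likely why adding _source there produced syntax errors.

```json
{
  "_source": false,
  "docvalue_fields": ["timestamp", "field1", "field2"],
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ]
    }
  }
}
```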

The relevant code is here if you want to confirm.

That sorts it for me. Thanks for the confirmation.
