[6.2.2] Datafeed not working with scripted field because of a mapping issue

machine-learning

(olivier hodac) #1

Hello,

I run some ML jobs on scripted fields. The datafeed uses a wildcard index pattern that matches several indices, and in some of those indices the fields used by the scripted field are not mapped. Until now this was not a problem, and the jobs worked fine.

Now I have created a new job with the same structure, and it fails with:

Datafeed is encountering errors extracting data: [toto] Search request returned shard failures; first failure: shard [[EO6_giUtTcSJysS_yM3Wqw][wilco__t1__12035538__2017__v2][0]], reason [RemoteTransportException[[wilco-3][10.91.216.75:9300][indices:data/read/search[phase/query]]]; nested: ScriptException[link error]; nested: NotSerializableExceptionWrapper[parse_exception: Field [MES1_APUEGT] does not exist in mappings]; ]; see logs for more info

The one which works

This job works fine:

{
  ...
  "analysis_config": {
    "bucket_span": "1d",
    "detectors": [
      {
        "detector_description": "high_mean(NOLOAD_EGT_CORRECTED)",
        "function": "high_mean",
        "field_name": "NOLOAD_EGT_CORRECTED",
        "partition_field_name": "fwot.keyword",
        "rules": []
      }
    ],
    "influencers": [
      "deident.keyword",
      "fwot.keyword"
    ]
  },
...
  "datafeed_config": {
    "indices": [
      "wilco*__t2__*"
    ],
    "types": [],
    "query": {
      "bool": {
        "must": [
//same as for the job
        ],
        "adjust_pure_negative": true,
        "boost": 1
      }
    },
    "script_fields": {
      "NOLOAD_EGT_CORRECTED": {
        "script": {
          "source": "doc['NOLOAD_EGT_SEL'].value*doc['NOLOAD_TAMB'].value",
          "lang": "expression"
        },
        "ignore_failure": false
      }
    },
    "chunking_config": {
      "mode": "auto"
    }
  }
}

even though the equivalent search request

GET wilco__t2*/_search
{
  "size": 1,
"query": {
      "bool": {
        "must": [
          {
            "range": {
              "NOLOAD_EGT_SEL": {
                "from": 10,
                "to": 1000,
                "include_lower": true,
                "include_upper": false,
                "boost": 1
              }
            }
          },
          {
            "exists": {
              "field": "NOLOAD_EGT_SEL",
              "boost": 1
            }
          }
        ],
        "adjust_pure_negative": true,
        "boost": 1
      }
    },
    "script_fields": {
      "NOLOAD_EGT_CORRECTED": {
        "script": {
          "source": "doc['NOLOAD_EGT_SEL'].value*doc['NOLOAD_TAMB'].value",
          "lang": "expression"
        },
        "ignore_failure": false
      }
    }
}

returns

{
  "took": 64,
  "timed_out": false,
  "_shards": {
    "total": 43,
    "successful": 4,
    "skipped": 0,
    "failed": 39,
    "failures": [
      {
        "shard": 0,
        "index": "wilco__t2__1607933__2015__v2",
        "node": "K-B_fm1aQe29-eaWxtu8dQ",
        "reason": {
          "type": "script_exception",
          "reason": "link error",
          "script_stack": [
            "doc['NOLOAD_EGT_SEL'].value",
            "     ^---- HERE"
          ],
          "script": "doc['NOLOAD_EGT_SEL'].value*doc['NOLOAD_TAMB'].value",
          "lang": "expression",
          "caused_by": {
            "type": "parse_exception",
            "reason": "Field [NOLOAD_EGT_SEL] does not exist in mappings"
          }
        }
      }
    ]
  },
  "hits": {
    "total": 32047,
    "max_score": 2,
    "hits": [
      {
        "_index": "wilco__t2__3866687__2016__v2",
        "_type": "doc",
        "_id": "sezdn_20160902t080646",
        "_score": 2,
        "fields": {
          "NOLOAD_EGT_CORRECTED": [
            324.966627138743
          ]
        }
      }
    ]
  }
}
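To see at a glance which indices are rejecting the script and why, a response like the one above can be grouped by root cause. This is only a minimal Python sketch: the function name `summarize_shard_failures` is made up for illustration, and the sample dict is a trimmed copy of the response shown above. Note that the `failures` array can list fewer entries than the `failed` count (here one entry for 39 failed shards), so the summary only covers what the response actually reports.

```python
from collections import defaultdict


def summarize_shard_failures(response):
    """Group failed indices by the root-cause reason reported for each shard.

    Hypothetical helper, not part of any Elasticsearch client.
    """
    by_reason = defaultdict(set)
    for failure in response.get("_shards", {}).get("failures", []):
        reason = failure.get("reason", {})
        # Prefer the nested cause (the parse_exception) over the generic "link error".
        cause = reason.get("caused_by", reason)
        by_reason[cause.get("reason", "unknown")].add(failure.get("index"))
    return {reason: sorted(indices) for reason, indices in by_reason.items()}


# Trimmed copy of the search response shown above.
response = {
    "_shards": {
        "total": 43,
        "successful": 4,
        "failed": 39,
        "failures": [
            {
                "shard": 0,
                "index": "wilco__t2__1607933__2015__v2",
                "reason": {
                    "type": "script_exception",
                    "reason": "link error",
                    "caused_by": {
                        "type": "parse_exception",
                        "reason": "Field [NOLOAD_EGT_SEL] does not exist in mappings",
                    },
                },
            }
        ],
    }
}

summary = summarize_shard_failures(response)
for reason, indices in summary.items():
    print(reason, "->", indices)
```

The same function can be applied to the response of the failing job further down, since it has the same structure.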

The one which fails

The other job, which is identical apart from using different indices and fields, fails:

job:

{
  ...
  "analysis_config": {
    "bucket_span": "1d",
    "detectors": [
      {
        "detector_description": "high_mean(MES1_EGT_CORRECTED)",
        "function": "high_mean",
        "field_name": "MES1_EGT_CORRECTED",
        "partition_field_name": "fwot.keyword",
        "rules": []
      }
    ],
    "influencers": [
      "fwot.keyword"
    ]
  },
  ...
  "datafeed_config": {
    "query_delay": "3h",
    "frequency": "1d",
    "indices": [
      "wilco__t1__*"
    ],
    "types": [],
    "query": {
      "bool": {
        "must": [
          {
            "range": {
              "MES1_APUEGT": {
                "from": 10,
                "to": 1000,
                "include_lower": true,
                "include_upper": false,
                "boost": 1
              }
            }
          },
          {
            "exists": {
              "field": "MES1_APUEGT",
              "boost": 1
            }
          },
          {
            "exists": {
              "field": "APUIT",
              "boost": 1
            }
          }
        ],
        "adjust_pure_negative": true,
        "boost": 1
      }
    },
    "script_fields": {
      "MES1_EGT_CORRECTED": {
        "script": {
          "source": "((doc['MES1_APUEGT'].value*doc['APUIT'].value",
          "lang": "expression"
        },
        "ignore_failure": true
      }
    },
    "chunking_config": {
      "mode": "auto"
    }
  }
}

query:

GET wilco__t1*/_search
{
  "size": 1,
  "query": {
      "bool": {
        "must": [
//same as for the job
        ],
        "adjust_pure_negative": true,
        "boost": 1
      }
    },
    "script_fields": {
      "MES1_EGT_CORRECTED": {
        "script": {
          "source": "((doc['MES1_APUEGT'].value*doc['APUIT'].value",
          "lang": "expression"
        },
        "ignore_failure": false
      }
    }
}

result:

{
  "took": 44,
  "timed_out": false,
  "_shards": {
    "total": 45,
    "successful": 10,
    "skipped": 0,
    "failed": 35,
    "failures": [
      {
        "shard": 0,
        "index": "wilco__t1__12035538__2017__v2",
        "node": "EO6_giUtTcSJysS_yM3Wqw",
        "reason": {
          "type": "script_exception",
          "reason": "link error",
          "script_stack": [
            "doc['MES1_APUEGT'].value",
            "     ^---- HERE"
          ],
          "script": "((doc['MES1_APUEGT'].value*doc['APUIT'].value",
          "lang": "expression",
          "caused_by": {
            "type": "parse_exception",
            "reason": "parse_exception: Field [MES1_APUEGT] does not exist in mappings"
          }
        }
      }
    ]
  },
  "hits": {
    "total": 6147,
    "max_score": 3,
    "hits": [
      {
        "_index": "wilco__t1__589__2010__v2",
        "_type": "doc",
        "_id": "fw-tst_20100401t102205",
        "_score": 3,
        "fields": {
          "MES1_EGT_CORRECTED": [
            790.5875641252434
          ]
        }
      }
    ]
  }
}

(olivier hodac) #2

In fact, what is really surprising is that the first job works even though the mapping is not set...


(rich collier) #3

"source": "((doc['MES1_APUEGT'].value*doc['APUIT'].value"

isn't valid syntax: the parentheses are mismatched (two opening parentheses are never closed).
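A quick way to catch this kind of typo before pasting a script into a datafeed is a bracket balance check. The following is only a rough Python sketch, not a parser for the expression language: the function name `unbalanced_brackets` is made up for illustration, and it assumes field names are written in single quotes as in the scripts in this thread.

```python
def unbalanced_brackets(source):
    """Return (char, position) for every unmatched bracket in a script source.

    Hypothetical helper: it only tracks () and [] and skips single-quoted text.
    """
    pairs = {")": "(", "]": "["}
    stack = []       # opening brackets that have not been closed yet
    problems = []    # closing brackets with no matching opener
    in_quote = False
    for pos, ch in enumerate(source):
        if ch == "'":
            in_quote = not in_quote  # toggle on each single quote
        elif in_quote:
            continue                 # ignore characters inside a quoted field name
        elif ch in "([":
            stack.append((ch, pos))
        elif ch in pairs:
            if stack and stack[-1][0] == pairs[ch]:
                stack.pop()
            else:
                problems.append((ch, pos))
    # Anything left on the stack was opened but never closed.
    return problems + stack


failing = "((doc['MES1_APUEGT'].value*doc['APUIT'].value"
working = "doc['NOLOAD_EGT_SEL'].value*doc['NOLOAD_TAMB'].value"

print(unbalanced_brackets(failing))  # [('(', 0), ('(', 1)]
print(unbalanced_brackets(working))  # []
```

For the script quoted above it reports the two opening parentheses at positions 0 and 1, while the script from the working job comes back clean.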

