Stuck on [Bucket max] cardinality estimate required for [influencers] [response_time_cat] but not supplied

Hi team,

I have created a runtime field (a categorical field) in the datafeed of an ML job so that I can use it as an influencer for the anomalies. The field shows up in the influencer section, but when I try to finish creating the job I get the following errors, one after another:

[request body.datafeedConfig.fields]: definition for this key is missing

{
  "statusCode": 400,
  "error": "Bad Request",
  "message": "illegal_argument_exception",
  "attributes": {
    "body": {
      "error": {
        "root_cause": [
          {
            "type": "illegal_argument_exception",
            "reason": "[Bucket max] cardinality estimate required for [influencers] [response_time_cat] but not supplied"
          }
        ],
        "type": "illegal_argument_exception",
        "reason": "[Bucket max] cardinality estimate required for [influencers] [response_time_cat] but not supplied"
      },
      "status": 400
    }
  }
}

Here is the datafeed:

{
  "datafeed_id": "",
  "job_id": "",
  "indices": [
    "xyz"
  ],
  "runtime_mappings": {
    "response_time_cat": {
      "type": "keyword",
      "script": {
        "source": """if (doc['response_time'].value >= 0 && doc['response_time'].value < 30) {emit("low");} 
		else if (doc['response_time'].value >= 31 && doc['response_time'].value < 60) {emit("medium");} 
		else if (doc['response_time'].value >= 61) {emit("high");} 
		else{emit("unknown");}"""
      }
    }
  },
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "match_all": {}
        },
        {
          "match_phrase": {
            "part.keyword": "ABC"
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  },
  "fields": [
    {
      "field": "response_time_cat"
    }
  ],
  "query_delay": "120s"
}

ELK version 7.11.1
Please help me to resolve the issue.

Regards.
Souvik

Are you using the UI or APIs (e.g. in DevTools) to create the job?
Please also include the job config.
Could you see anything via the preview tab?

Hi @sophie_chang,

Thanks for your quick response.

I used the Advanced Configuration wizard to create the job. I can also see the datafeed output in the preview window; see below:

[Screenshot: datafeed preview output]

Here is the job config:

{
  "job_id": "test_and_delete_runtime_field_added",
  "description": "",
  "groups": [],
  "analysis_config": {
    "bucket_span": "4h",
    "detectors": [
      {
        "function": "low_count"
      }
    ],
    "influencers": [
      "response_time_cat"
    ]
  },
  "data_description": {
    "time_field": "response_datetime"
  },
  "analysis_limits": {
    "model_memory_limit": "10MB"
  },
  "model_plot_config": {
    "enabled": true,
    "annotations_enabled": true
  }
}

Am I missing something in the configuration?

For your use case, I would expect better results from analysing the average response_time.

The analysis you are doing now detects a low event rate of documents (low_count). I do not know the shape of your data, but it seems more likely that the value of response_time, rather than the event rate of documents, is what indicates slow responses. Just a thought.
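
As a rough sketch of that idea, untested against your data, a job along these lines would model the mean of response_time in each bucket instead of the document count. The job ID is made up for illustration, and the bucket span, time field and memory limit are simply carried over from your config. If only unusually slow responses matter, high_mean is the one-sided variant of the same function.

```console
# Hypothetical job ID; everything except the detector is copied from the job config above
PUT _ml/anomaly_detectors/response_time_mean_sketch
{
  "analysis_config": {
    "bucket_span": "4h",
    "detectors": [
      {
        "function": "mean",
        "field_name": "response_time"
      }
    ]
  },
  "data_description": {
    "time_field": "response_datetime"
  },
  "analysis_limits": {
    "model_memory_limit": "10MB"
  }
}
```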

Putting that aside, it appears that the Advanced wizard is failing at the validation step. Unfortunately this may be a bug in the UI, as you are using quite an old version.

To work around the UI validation without upgrading, I recommend that you create the job using Dev Tools. The following creates the job without errors in 8.4, but may need adjusting for 7.11.1:

PUT _ml/anomaly_detectors/test_and_delete_runtime_field_added
{
  "description": "",
  "groups": [],
  "analysis_config": {
    "bucket_span": "4h",
    "detectors": [
      {
        "function": "low_count"
      }
    ],
    "influencers": [
      "response_time_cat"
    ]
  },
  "data_description": {
    "time_field": "response_datetime"
  },
  "analysis_limits": {
    "model_memory_limit": "10MB"
  },
  "model_plot_config": {
    "enabled": true,
    "annotations_enabled": true
  }
}


PUT _ml/datafeeds/datafeed-test_and_delete_runtime_field_added
{
  "job_id": "test_and_delete_runtime_field_added",
  "indices": [
    "xyz"
  ],
  "runtime_mappings": {
    "response_time_cat": {
      "type": "keyword",
      "script": {
        "source": """
          // Contiguous bands: [0, 30) low, [30, 60) medium, >= 60 high
          if (doc['response_time'].value >= 0 && doc['response_time'].value < 30) {emit("low");}
          else if (doc['response_time'].value >= 30 && doc['response_time'].value < 60) {emit("medium");}
          else if (doc['response_time'].value >= 60) {emit("high");}
          else {emit("unknown");}
        """
      }
    }
  },
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "match_all": {}
        },
        {
          "match_phrase": {
            "part.keyword": "ABC"
          }
        }
      ]
    }
  },
  "query_delay": "120s"
}
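
Once both requests succeed, something like the following should let you check the datafeed output and then start the analysis from Dev Tools as well. These are the standard datafeed preview, job open and datafeed start endpoints, used with the IDs from the two requests above. This sequence has not been verified on 7.11.1, so treat it as a sketch.

```console
# Check that response_time_cat appears in the documents the datafeed would send to the job
GET _ml/datafeeds/datafeed-test_and_delete_runtime_field_added/_preview

# Open the job so that it can accept data
POST _ml/anomaly_detectors/test_and_delete_runtime_field_added/_open

# Start the datafeed; with no start or end given it begins at the earliest available data and keeps running
POST _ml/datafeeds/datafeed-test_and_delete_runtime_field_added/_start
```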
