ML - Updated analytics task state to [failed] with reason [Limit of total fields [1000] has been exceeded]

Hi,

I want to set up an ML job with outlier detection for HTTP statuses 4xx and 5xx. I use ELK stack version 7.15.1.

This is the JSON of the ML job.

{
  "id": "ml-http-4xx",
  "create_time": 1636545614078,
  "version": "7.15.1",
  "description": "",
  "source": {
    "index": [
      "apm-*-transaction*"
    ],
    "query": {
      "bool": {
        "filter": [
          {
            "bool": {
              "should": [
                {
                  "range": {
                    "http.response.status_code": {
                      "gte": "400"
                    }
                  }
                }
              ],
              "minimum_should_match": 1
            }
          },
          {
            "bool": {
              "should": [
                {
                  "range": {
                    "http.response.status_code": {
                      "lte": "500"
                    }
                  }
                }
              ],
              "minimum_should_match": 1
            }
          }
        ]
      }
    }
  },
  "dest": {
    "index": "ml-http-4xx",
    "results_field": "ml"
  },
  "analysis": {
    "outlier_detection": {
      "compute_feature_influence": true,
      "outlier_fraction": 0.05,
      "standardization_enabled": true
    }
  },
  "analyzed_fields": {
    "includes": [
      "http.response.status_code"
    ],
    "excludes": []
  },
  "model_memory_limit": "55mb",
  "allow_lazy_start": false,
  "max_num_threads": 1
}

After starting the job, I received the following error and the job state is now failed:

Updated analytics task state to [failed] with reason [Limit of total fields [1000] has been exceeded]

I use the apm-*-transaction* index pattern because, when I used apm-*, I got an exception due to the metric mappings.

Any help? Thanks.
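
A note on the error itself, for anyone who lands here with the same message: Elasticsearch reports "Limit of total fields [1000] has been exceeded" when an index mapping would grow past the index.mapping.total_fields.limit setting (default 1000). Outlier detection copies the source documents into the destination index, so wide APM mappings may push ml-http-4xx over that limit. One possible workaround, not confirmed anywhere in this thread, is to restrict which fields get copied by adding source._source to the job config. A minimal sketch, assuming only the status code field is needed in the destination:

```json
{
  "source": {
    "index": [
      "apm-*-transaction*"
    ],
    "_source": {
      "includes": [
        "http.response.status_code"
      ]
    }
  }
}
```

Fields left out this way cannot be analyzed, so anything listed in analyzed_fields has to be included here as well. Creating the destination index up front with a higher index.mapping.total_fields.limit is another option that may be worth trying.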

  1. Are you sure you don't want to do anomaly detection instead of outlier detection? I know they sound similar, but anomaly detection jobs can also run on data continuously, including real-time.

  2. You say you want 4xx and 5xx codes, but your query is >=400 and <=500, which would ONLY give you 4xx codes (plus exactly 500).
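
For the second point, a single range clause is probably enough to cover both 4xx and 5xx. This is only a sketch of what the source query could look like, assuming http.response.status_code is mapped as a numeric field and that every code from 400 up to but not including 600 should match:

```json
{
  "query": {
    "range": {
      "http.response.status_code": {
        "gte": 400,
        "lt": 600
      }
    }
  }
}
```

The same clause should work in the datafeed query of an anomaly detection job.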

Hi Rich,

Yes, you're right.

  1. I made a mistake, it should be anomaly detection.
  2. I forgot to add a condition to select 5xx codes.

I've tried anomaly detection and it works now.

Thanks.

