Unable to load APM Errors due to new error: Fielddata is disabled on [error.grouping_key]

I realize there are a fair number of other topics with a very similar error, but this feels different because it's the Observability APM UI itself that is triggering the exception. I just went to my APM in Elastic Cloud to view errors for a few of my services but am unable to load the list because of this error. The data comes directly from the Node.js APM and RUM agents (for my server and client apps respectively), and both services hit the same error. What would suddenly cause this? I also checked and there is an error.grouping_key.keyword field, so why isn't APM using it? I would understand if this were data coming from Logstash and I had a bad index, but all of this (I thought) was managed by the APM plugin and APM packages. Additionally, this was working previously.

System:
Host - Elastic Cloud
Elasticsearch version - 8.6.2

Error
search_phase_execution_exception: [illegal_argument_exception] Reason: Fielddata is disabled on [error.grouping_key] in [.ds-logs-apm.error-default-2023.03.25-000002]. Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [error.grouping_key] in order to load field data by uninverting the inverted index. Note that this can use significant memory. (500)

I also checked the mapping on the .ds-logs-apm.error-default backing index and found this:

"grouping_key": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
},
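For reference, the same failure should be reproducible outside the APM UI with a plain terms aggregation. This is just a sketch: the error message shows APM aggregating on error.grouping_key itself, and I'm assuming it never falls back to the .keyword subfield.

# Aggregating on the text field trips the same fielddata check as the APM UI.
GET /logs-apm.error-default/_search
{
  "size": 0,
  "aggs": {
    "error_groups": {
      "terms": { "field": "error.grouping_key" }
    }
  }
}

# The same aggregation on the .keyword subfield works, but since APM queries
# error.grouping_key directly, the subfield doesn't help it.
GET /logs-apm.error-default/_search
{
  "size": 0,
  "aggs": {
    "error_groups": {
      "terms": { "field": "error.grouping_key.keyword" }
    }
  }
}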

@pocketcolin Asking around, this sounds like a mapping issue in your Elasticsearch index/data stream, though I'm not sure how that happened. It's unexpected that your mapping for "error.grouping_key" is of type "text".

First, let's make sure you are looking at the correct index.

  1. I found a document in Discover with an error.grouping_key:
{
  "_index": ".ds-logs-apm.error-default-2023.03.08-000001",
  "_id": "KeUqV4cBf1HszhtT7NQz",
  "_version": 1,
  "_score": 0,
  "_source": {
    "agent": { ...  },
    "process": { ... },
    "data_stream.namespace": "default",
    "error": {
      "exception": [
        {
          "code": "ERR_HTTP2_STREAM_CANCEL",
          "stacktrace": [ ... ],
          "handled": false,
          "message": "The pending stream has been canceled (caused by: Client network socket disconnected before secure TLS connection was established)",
          "type": "Error"
        }
      ],
      "culprit": "NodeError (node:internal/errors)",
      "id": "b6eca2d1e124232f9346c5e2ee9c4048",
      "grouping_key": "e173a2eb24aa85deda9f4896ec43ff92",
      "grouping_name": "The pending stream has been canceled (caused by: Client network socket disconnected before secure TLS connection was established)"
    },
    "message": "The pending stream has been canceled (caused by: Client network socket disconnected before secure TLS connection was established)",
    "processor": {
      "name": "error",
      "event": "error"
    },
    "data_stream.type": "logs",
...
  2. That "_index": ".ds-logs-apm.error-default-2023.03.08-000001" corresponds to the "logs-apm.error-default" data stream (GET /_data_stream/logs-apm.error-default in the Dev Tools > Console; see the sketch after the mapping below for what to look for in that response). In my deployment, this is the mapping for error.grouping_key:
GET /logs-apm.error-default/_mapping/field/error.grouping_key

{
  ".ds-logs-apm.error-default-2023.03.08-000001": {
    "mappings": {
      "error.grouping_key": {
        "full_name": "error.grouping_key",
        "mapping": {
          "grouping_key": {
            "type": "keyword",
            "ignore_above": 1024
          }
        }
      }
    }
  }
}

(My full data stream mapping is here in case it helps: `GET /logs-apm.error-default/_mapping` (8.6.2) · GitHub)
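Also worth checking which index template the data stream actually resolved to; the data stream API reports it in a "template" field (a quick sketch, names will differ per deployment):

# Look at the "template" field in the response; for this data stream it should
# point at the managed APM template rather than a custom one.
GET /_data_stream/logs-apm.error-default

If it points at something other than the managed template, that template is the likely source of the odd mapping.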

Can you run GET /logs-apm.error-default/_mapping/field/error.grouping_key and/or GET /logs-apm.error-default/_mapping in your deployment's Dev Tools > Console so we can compare mappings? (This may be exactly what you already posted.)

The "type": "text" mapping with a "keyword" multi-field implies that Elasticsearch dynamically inferred the type of that field -- per the Dynamic field mapping | Elasticsearch Guide [8.7] | Elastic rules, that is exactly what dynamic mapping produces for a string value. However, my understanding is that this should not happen for this data stream, so something funky must have happened.
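One way to see where a stray mapping could come from is to simulate template resolution for the data stream name (a sketch; it just shows what a new backing index would get):

# Shows which index template wins for this name, the composed mappings it would
# produce, and any overlapping templates that also matched but were not applied.
POST /_index_template/_simulate_index/logs-apm.error-default

If the "overlapping" section lists the expected managed template while something else wins, that would explain the dynamic text mapping.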

Really appreciate your assistance here, @trentm !

I agree-- this definitely doesn't seem like the kind of thing that should happen. Here is what I got from running those commands (it is what I already posted but it's good to double check):

GET /logs-apm.error-default/_mapping/field/error.grouping_key
{
  ".ds-logs-apm.error-default-2023.03.25-000002": {
    "mappings": {
      "error.grouping_key": {
        "full_name": "error.grouping_key",
        "mapping": {
          "grouping_key": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}

Just posting an update here in case anyone else finds this thread and is experiencing a similar issue: it turned out that I had created a custom index template for Logstash logs that was also being applied to my APM data streams (I think the index pattern I used was logs-*, which was too generic). Changing that fixed my issue.
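For anyone cleaning up after the same mistake, the general shape of the fix is something like this (a sketch rather than exactly what I ran; my-logstash-template is a placeholder name, use whatever GET /_index_template shows in your deployment):

# Inspect the custom template; the offending part was an index_patterns entry of
# "logs-*", which also matched the APM data streams. Narrow it to something like
# "logs-logstash-*" (via PUT /_index_template with the full template body).
GET /_index_template/my-logstash-template

# A corrected template only affects new backing indices, so roll the data stream
# over to get a backing index with the proper keyword mapping.
POST /logs-apm.error-default/_rollover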

