Terms aggregation on a field with different types in different mappings

So I have a bunch of indices, rotated monthly, with two mappings: server and client. Both have a field user_id. In server, user_id is a string; in client, it is mostly a string, but in one month's index it got indexed as an integer.

When I try to do a terms aggregation on the user_id field, I get an error - ClassCastException[org.elasticsearch.search.aggregations.bucket.terms.StringTerms$Bucket cannot be cast to org.elasticsearch.search.aggregations.bucket.terms.LongTerms$Bucket]

When I query the monthly indices which have the correct mapping, I don't get the error.

This is fine, and expected.

But what I don't understand is why I still get this error even when I specify a mapping type in the request.

That is, I query /events-*/server/_search and run a terms aggregation on the user_id field.
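
For reference, the request body looks roughly like the sketch below. The aggregation name `users` and the `size` value are placeholders chosen for illustration; only the index pattern, the `server` type and the `user_id` field come from my setup.

```python
import json


def build_terms_agg_request(field, agg_name="users", size=10):
    """Build the body of a search request containing one terms aggregation.

    agg_name and size are illustrative placeholders, not values from the thread.
    """
    return {
        "size": 0,  # skip the hits, only return the aggregation buckets
        "aggs": {
            agg_name: {
                "terms": {"field": field, "size": size}
            }
        },
    }


# This body would be sent to /events-*/server/_search
body = build_terms_agg_request("user_id")
print(json.dumps(body, indent=2))
```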

StringTerms vs. LongTerms: so the field types differ and conflict?

Hi @elssar,

Any chance you allow dynamic field mapping, and user_id was not in the mapping when it was encountered for the first time?

Anyway, you can correct the problem by creating a new index with the correct mapping and using the reindex API to reindex the data. See index aliases and zero downtime on how to achieve this without affecting your users.
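
As a rough sketch of that approach, the two request bodies could be built like this. All index and alias names here are hypothetical, and it assumes clients already query through an alias rather than the concrete index name. The first body is meant for POST /_reindex, the second for POST /_aliases.

```python
import json


def build_reindex_body(source_index, dest_index):
    """Body for the reindex API: copy documents from source_index into dest_index.

    dest_index should already exist with the corrected mapping.
    """
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }


def build_alias_swap_body(alias, old_index, new_index):
    """Body for the aliases API: move the alias in a single request.

    Both actions are sent together so the alias never points at nothing.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names, for illustration only
reindex_body = build_reindex_body("events-old", "events-fixed")
alias_body = build_alias_swap_body("events-current", "events-old", "events-fixed")

print(json.dumps(reindex_body, indent=2))
print(json.dumps(alias_body, indent=2))
```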

Daniel

Hi @danielmitterdorfer,

I understand that I have to reindex (I've reindexed more times than I'd like to admit :sweat_smile:).

I think my last post didn't explain my query well enough.

What I don't understand is why a field having conflicting types in two different mappings would cause problems when I'm only aggregating over one mapping.

That is,

{
  "events": {
    "mappings": {
      "server": {
        "user_id": {
          "full_name": "user_id",
          "mapping": {
            "user_id": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      },
      "client": {
        "user_id": {
          "full_name": "user_id",
          "mapping": {
            "user_id": {
              "type": "long"
            }
          }
        }
      }
    }
  }
}

Now when I run a terms aggregation on the user_id field in /events/server, why does the mapping for user_id in client matter? Shouldn't they be independent of each other? I understand that I'd get an error if I sent a query to /events.
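
To find which indices are affected, a small script along these lines could scan a field mapping response shaped like the one above. It is only a sketch: the function name `find_type_conflicts` is made up, and it assumes the response has already been parsed into a Python dict.

```python
from collections import defaultdict


def find_type_conflicts(field_mappings):
    """Return {(index, field): {mapping_type: field_type}} for fields whose
    type differs between mapping types of the same index."""
    conflicts = {}
    for index, index_body in field_mappings.items():
        # full field name -> {mapping type: field type}
        seen = defaultdict(dict)
        for mapping_type, fields in index_body.get("mappings", {}).items():
            for field, info in fields.items():
                full_name = info.get("full_name", field)
                for leaf in info.get("mapping", {}).values():
                    seen[full_name][mapping_type] = leaf.get("type")
        for field, by_type in seen.items():
            if len(set(by_type.values())) > 1:
                conflicts[(index, field)] = by_type
    return conflicts


# Same structure as the mapping shown above
response = {
    "events": {
        "mappings": {
            "server": {
                "user_id": {
                    "full_name": "user_id",
                    "mapping": {
                        "user_id": {"type": "string", "index": "not_analyzed"}
                    },
                }
            },
            "client": {
                "user_id": {
                    "full_name": "user_id",
                    "mapping": {"user_id": {"type": "long"}},
                }
            },
        }
    }
}

conflicts = find_type_conflicts(response)
print(conflicts)
```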

Hi @elssar,

Oh, yes. I totally misunderstood.

when I run a terms aggregation on the user_id field in /events/server, why does the mapping for user_id in client matter? Shouldn't they be independent of each other?

Your assumption is not correct. Fields with the same name must have the same mapping (if they are in the same index). For more details see:

Daniel

@danielmitterdorfer ah, that explains it then. Thank you :slight_smile:

Sure, you're welcome. :slight_smile: