Prevent fielddata from being loaded for the _id field

I'm currently seeing a huge amount of fielddata loaded into RAM whenever a user runs an aggregation on the _id field. I'd like to prevent that, because it can bring the system down. Is there a recommended way to do this?

Yes, the official documentation says the following about the _id field and aggregation and sorting:

The value of the _id field is also accessible in aggregations or for sorting, but doing so is discouraged as it requires to load a lot of data in memory. In case sorting or aggregating on the _id field is required, it is advised to duplicate the content of the _id field in another field that has doc_values enabled.

So you should copy the value of the _id field into a new field that has doc_values: true in the index mapping, and sort or aggregate on that field instead.
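As a rough sketch of that advice, the request bodies could look something like the following, written here as plain Python dictionaries. The field name doc_id and the helper with_id_copy are made up for illustration and are not from the documentation. The mapping uses the typeless format of Elasticsearch 7 and later, so adjust it for older versions.

```python
import json

# Hypothetical field name for the copy of _id; pick whatever suits your index.
COPY_FIELD = "doc_id"

# Index creation body: a keyword field with doc_values enabled.
# (keyword fields have doc_values on by default; it is spelled out for clarity.)
index_body = {
    "mappings": {
        "properties": {
            COPY_FIELD: {"type": "keyword", "doc_values": True}
        }
    }
}


def with_id_copy(doc_id, source):
    """Return a copy of the document source that also carries its own ID."""
    doc = dict(source)
    doc[COPY_FIELD] = doc_id
    return doc


# Aggregate on the copy instead of on _id.
terms_agg_body = {
    "size": 0,
    "aggs": {"ids": {"terms": {"field": COPY_FIELD}}},
}

if __name__ == "__main__":
    print(json.dumps(index_body, indent=2))
    print(json.dumps(with_id_copy("abc-123", {"title": "example"})))
    print(json.dumps(terms_agg_body))
```

The document is still indexed under its own _id (so duplicate prevention keeps working); the extra field only exists so that sorting and aggregating can use doc_values.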

You may also want to read this interesting blog post to understand fielddata and doc_values.

See also #49166, which will add an option to throw an exception rather than load fielddata on the _id field.

@Bernt_Rostad The issue is that I'm using the _id field to prevent duplicates, so I have to use it. I do have the _id copied into another field, but some users still try to use _id, either intentionally or by accident.
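Until a server-side option is available, one possible stopgap, assuming user searches pass through an application layer you control, is to reject request bodies that sort or aggregate on _id before they reach the cluster. The function below is a hypothetical helper, not part of any Elasticsearch client, and it only checks the "sort" and "aggs"/"aggregations" sections of a search body. It does not catch scripts or other ways of touching _id.

```python
def uses_id_fielddata(body):
    """Return True if a search body sorts or aggregates on the _id field."""

    def sort_hits_id(sort):
        # "sort" may be a single entry or a list of entries.
        items = sort if isinstance(sort, list) else [sort]
        for item in items:
            if item == "_id":
                return True
            if isinstance(item, dict) and "_id" in item:
                return True
        return False

    def aggs_hit_id(aggs):
        # Walk every named aggregation, including nested sub-aggregations.
        for agg in aggs.values():
            if not isinstance(agg, dict):
                continue
            for key, value in agg.items():
                if key in ("aggs", "aggregations"):
                    if isinstance(value, dict) and aggs_hit_id(value):
                        return True
                elif isinstance(value, dict) and value.get("field") == "_id":
                    return True
        return False

    if "sort" in body and sort_hits_id(body["sort"]):
        return True
    aggs = body.get("aggs") or body.get("aggregations")
    return isinstance(aggs, dict) and aggs_hit_id(aggs)


if __name__ == "__main__":
    bad = {"aggs": {"by_id": {"terms": {"field": "_id"}}}}
    good = {"aggs": {"by_id": {"terms": {"field": "doc_id"}}}}
    print(uses_id_fielddata(bad), uses_id_fielddata(good))
```

A caller could return an error to the user, or rewrite the field name to the doc_values copy, whenever this check returns True.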

@DavidTurner -- That's exactly what I'm looking for. Surprised that it's an option that's only now going into the stack! Thanks :smiley:
