Fielddata memory usage

Hi!

I have a v7.2.1 ES cluster and I'm trying to lower heap memory usage. When I checked GET /_cat/fielddata, I found a lot of non-text fields in the list (e.g., "foo.keyword", "bar_ip"). As far as I understood from the relevant documentation, those shouldn't appear in that list or occupy heap memory. Any idea what I'm missing?

Thanks!

Hello @YvorL,

GET _cat/fielddata shows not only fielddata (which, as you've said, should only be used by text fields) but also global ordinals.

Our documentation provides a good explanation of what global ordinals are and why they are generated.

A similar question, "Global ordinals performance and size on-heap", also has a really great answer.

If you're running aggregations or sorting on high-cardinality keyword fields (e.g. a field holding a unique id, or the document's _id), global ordinals are generated and, as they're expensive to build, they're cached indefinitely (by default).
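To see which fields hold the most heap, one option is to request the cat output as JSON (GET /_cat/fielddata?format=json&bytes=b) and sum the sizes per field across nodes. The snippet below is only a rough sketch: the helper name and the sample rows are made up for illustration, and it assumes each row carries "field" and "size" keys, with "size" being a plain byte count because of bytes=b.

```python
from collections import defaultdict


def fielddata_bytes_per_field(rows):
    """Sum the reported size per field across all nodes, largest first.

    `rows` is assumed to be the parsed JSON list returned by
    GET /_cat/fielddata?format=json&bytes=b (one row per node and field).
    """
    totals = defaultdict(int)
    for row in rows:
        # The cat API returns values as strings, so convert before summing.
        totals[row["field"]] += int(row["size"])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample rows, not taken from a real cluster.
sample_rows = [
    {"node": "node-1", "field": "foo.keyword", "size": "2048"},
    {"node": "node-2", "field": "foo.keyword", "size": "1024"},
    {"node": "node-1", "field": "bar_ip", "size": "512"},
]

for field, size in fielddata_bytes_per_field(sample_rows):
    print(field, size)
```

The fields at the top of that list are the ones worth checking for unintended aggregations or sorts.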

Another, less obvious case where global ordinals are used is Kibana KQL auto-complete on the _id field: Kibana triggers an aggregation on that field behind the scenes.

If the global ordinals have been loaded by mistake (a bad aggregation or a bad query), you can clear the cache using POST /<index name>/_cache/clear?fielddata=true. See the clear cache API documentation for details.
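If you only want to drop the cache for specific fields, the clear cache API also accepts a fields parameter alongside fielddata=true. Here is a minimal sketch that just builds the request path; the helper name and the "my-index" index name are hypothetical, and you would still send the result as a POST with whatever client you normally use.

```python
from urllib.parse import quote


def clear_fielddata_path(index, fields=None):
    """Build the path for POST /<index>/_cache/clear?fielddata=true.

    `fields` is an optional list of field names; when given, only the
    fielddata/global ordinals of those fields are targeted.
    """
    path = "/" + quote(index, safe="") + "/_cache/clear?fielddata=true"
    if fields:
        # Keep commas literal so the list stays readable in the URL.
        path += "&fields=" + quote(",".join(fields), safe=",")
    return path


# "my-index" is a placeholder index name used for illustration only.
print(clear_fielddata_path("my-index"))
print(clear_fielddata_path("my-index", ["foo.keyword", "bar_ip"]))
```

Keep in mind that the global ordinals will be rebuilt the next time an aggregation needs them, so clearing only helps if the query that loaded them is not run again.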
