Fielddata memory usage

YvorL · April 23, 2020, 9:18pm

Hi!

I have a v7.2.1 ES cluster and I'm trying to lower heap memory usage. When I checked GET /_cat/fielddata, I found a lot of non-text fields in the list (e.g., "foo.keyword", "bar_ip"). As far as I understood from the relevant documentation, those shouldn't appear in that list or occupy heap memory. Any idea what I am missing?

Thanks!

Luca_Belluccini · April 24, 2020, 12:14am

Hello @YvorL,

GET _cat/fielddata shows not only fielddata (which, as you've said, should only be used by text fields) but also global ordinals.
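
To see which fields account for most of that memory, the output of the API can be summed per field across nodes. Below is a minimal sketch, assuming the response was fetched as JSON with byte sizes (GET /_cat/fielddata?format=json&bytes=b). The sample rows and the helper name fielddata_by_field are made up for illustration, and the total for a field does not distinguish fielddata from global ordinals.

```python
import json
from collections import defaultdict
from typing import Dict

# Hypothetical sample of a `GET /_cat/fielddata?format=json&bytes=b` response.
# Node names, addresses and sizes are invented for this example.
SAMPLE = """[
  {"id": "n1", "host": "10.0.0.1", "ip": "10.0.0.1", "node": "node-1",
   "field": "foo.keyword", "size": "1048576"},
  {"id": "n2", "host": "10.0.0.2", "ip": "10.0.0.2", "node": "node-2",
   "field": "foo.keyword", "size": "2097152"},
  {"id": "n1", "host": "10.0.0.1", "ip": "10.0.0.1", "node": "node-1",
   "field": "bar_ip", "size": "4096"}
]"""


def fielddata_by_field(cat_json: str) -> Dict[str, int]:
    """Sum the reported size (in bytes) per field across all nodes,
    returning the fields ordered from largest to smallest."""
    totals: Dict[str, int] = defaultdict(int)
    for row in json.loads(cat_json):
        totals[row["field"]] += int(row["size"])
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


if __name__ == "__main__":
    for field, size in fielddata_by_field(SAMPLE).items():
        print(f"{field}\t{size} bytes")
```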

Our documentation provides a good explanation of what they are and why they are generated.

A really great answer has been provided on a similar question: Global ordinals performance and size on-heap

If you're executing aggregations or sorting on keyword fields with high cardinality (e.g. a field that holds a unique id, such as the _id of the document), global ordinals are generated, and because they're expensive to generate, they're cached indefinitely (by default).
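
To give an intuition for why cardinality matters here, this is a conceptual sketch of the idea behind global ordinals: each segment numbers its own sorted terms, and a shard-wide mapping translates those per-segment numbers into one shared numbering. It is only an illustration in plain Python, not how Lucene or Elasticsearch implement it, and build_global_ordinals is a made-up name.

```python
from typing import List, Tuple


def build_global_ordinals(
    segments: List[List[str]],
) -> Tuple[List[str], List[List[int]]]:
    """Each segment is a sorted list of its unique terms; a term's
    segment ordinal is its index in that list. Returns the sorted
    shard-wide term list and, per segment, a table that maps
    segment ordinal -> global ordinal."""
    global_terms = sorted({term for seg in segments for term in seg})
    global_ord = {term: i for i, term in enumerate(global_terms)}
    mapping = [[global_ord[term] for term in seg] for seg in segments]
    return global_terms, mapping


if __name__ == "__main__":
    # Two toy segments with overlapping terms (invented data).
    segments = [["apple", "cherry"], ["banana", "cherry", "date"]]
    terms, mapping = build_global_ordinals(segments)
    print(terms)
    print(mapping)
```

The mapping tables have one entry per unique term per segment, so in this simplified picture a field where nearly every document has its own value (like an id) yields tables roughly as large as the number of documents.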

Another, less obvious case where global ordinals are used is Kibana KQL auto-complete on the _id field. Behind the scenes, Kibana triggers an aggregation on that field.

If the global ordinals have been loaded by mistake (a bad aggregation or a bad query), you can clear the cache using POST /<index name>/_cache/clear?fielddata=true. See the clear cache API documentation for details.
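
If you want to script this, here is a small sketch that only builds the request URL. The base URL, index name and the helper clear_fielddata_url are placeholders I made up; the optional fields parameter of the clear cache API is used to limit the clear to specific fields. The request itself has to be sent as a POST with whatever HTTP client and authentication your cluster needs.

```python
from typing import Iterable, Optional
from urllib.parse import quote, urlencode


def clear_fielddata_url(
    base_url: str, index: str, fields: Optional[Iterable[str]] = None
) -> str:
    """Build the URL for POST /<index>/_cache/clear?fielddata=true.
    If `fields` is given, only those fields are targeted."""
    params = {"fielddata": "true"}
    if fields:
        params["fields"] = ",".join(fields)
    # Keep commas readable in the query string instead of %2C.
    query = urlencode(params, safe=",")
    return f"{base_url.rstrip('/')}/{quote(index, safe='')}/_cache/clear?{query}"


if __name__ == "__main__":
    # Hypothetical cluster address and index name.
    print(clear_fielddata_url("http://localhost:9200", "my-index"))
    print(clear_fielddata_url("http://localhost:9200", "my-index",
                              ["foo.keyword", "bar_ip"]))
```

Keep in mind that the next aggregation or sort on the same field will rebuild the global ordinals, so clearing only helps if the query that loaded them is not run again.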

