I have a v7.2.1 ES cluster and am trying to lower heap memory usage. When I checked
GET /_cat/fielddata, I found a lot of non-text fields in the list (e.g., "foo.keyword", "bar_ip"). As far as I understood from the relevant documentation, those shouldn't appear in that list or occupy heap memory. Any idea what I am missing?
GET _cat/fielddata shows not only the fielddata (which, as you've said, should be used only by text fields), but also the global ordinals.
Our documentation provides quite a good explanation of what global ordinals are and why they are generated.
A really great answer has been provided on a similar question: Global ordinals performance and size on-heap
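To see which fields account for most of that heap, the per-node rows from the cat API can be summed per field. The following is only a minimal sketch: it assumes the rows were fetched with GET /_cat/fielddata?format=json&bytes=b, the helper name fielddata_bytes_by_field is made up for illustration, and the sample rows are hypothetical, not real cluster output.

```python
from collections import defaultdict


def fielddata_bytes_by_field(rows):
    """Sum the `size` column per field across all nodes, largest first.

    `rows` is assumed to be the parsed JSON list returned by
    GET /_cat/fielddata?format=json&bytes=b (sizes in plain bytes).
    """
    totals = defaultdict(int)
    for row in rows:
        totals[row["field"]] += int(row["size"])
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical sample shaped like the cat API output (not real data).
sample_rows = [
    {"node": "node-1", "field": "foo.keyword", "size": "2048"},
    {"node": "node-2", "field": "foo.keyword", "size": "1024"},
    {"node": "node-1", "field": "bar_ip", "size": "512"},
]

top_fields = fielddata_bytes_by_field(sample_rows)
for field, size in top_fields:
    print(f"{field}: {size} bytes")
```

Fields near the top of that list are the first candidates to check for unintended aggregations.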
If you're running aggregations or sorting on keyword fields with high cardinality (e.g. a field that holds a unique id, such as the _id of the document), global ordinals are generated and, because they're expensive to build, they're cached indefinitely by default.
Another, less obvious case where global ordinals are used is Kibana KQL auto-complete on the _id field: behind the scenes, Kibana triggers an aggregation on that field.
If the global ordinals have been loaded by mistake (a bad aggregation or a bad query), you can clear the cache using
POST /<index name>/_cache/clear?fielddata=true. See the clear cache API documentation for details.
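If you'd rather script that call, here is a rough sketch using only the Python standard library. The base URL, the index name my-index and the helper name clear_fielddata_request are placeholders I made up; the optional fields parameter of the clear cache API is used to limit the clear to specific fields. The request is only built here, not sent.

```python
import urllib.request
from urllib.parse import quote, urlencode


def clear_fielddata_request(base_url, index, fields=None):
    """Build (but do not send) a POST /<index>/_cache/clear?fielddata=true request.

    `fields` is an optional list of field names; when given, it is passed
    as the `fields` query parameter so only those fields are cleared.
    """
    params = {"fielddata": "true"}
    if fields:
        params["fields"] = ",".join(fields)
    url = f"{base_url.rstrip('/')}/{quote(index, safe=',*')}/_cache/clear?{urlencode(params)}"
    return urllib.request.Request(url, method="POST")


# Placeholder cluster address and index name (assumptions, adjust to your setup).
req = clear_fielddata_request("http://localhost:9200", "my-index", ["foo.keyword"])
print(req.get_method(), req.full_url)

# To actually send it against a reachable cluster:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

Keep in mind that the global ordinals will simply be rebuilt the next time an aggregation needs them, so clearing only helps if the offending query is gone.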