Mapping: Array Type vs Text Type (with fielddata set to true)

I have a use case where I want to use a text field in aggregations. By default, text fields are not aggregatable; to make them aggregatable, we need to enable the fielddata parameter in the mapping.
Since enabling fielddata comes at the cost of heap memory, I'm wondering whether the following workaround makes sense:

  1. Pre-analyse (tokenize) the text field myself, turn the tokens into an array, and index that array.
  2. After indexing the array into a field called "text_array", I get two fields: text_array (text type) and text_array.keyword (keyword type).
  3. As text_array.keyword is of type keyword, I can use it in aggregations.
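
To make the steps above concrete, here is a rough sketch of what I have in mind. It only builds the request bodies as plain Python dicts and does not talk to a cluster. The regex tokenizer is a simplified stand-in (an assumption, not equivalent to any built-in analyzer), and the helper names `pre_analyse` and `build_document` and the aggregation name `top_tokens` are made up for illustration. It also assumes the default dynamic mapping, which indexes strings as text with a keyword sub-field.

```python
import json
import re


def pre_analyse(text):
    """Rough stand-in for an analyzer: lowercase, then split on non-word characters."""
    return [token for token in re.split(r"\W+", text.lower()) if token]


def build_document(text):
    # Step 1 and 2: index the tokens as an array in "text_array".
    # With dynamic mapping this yields text_array (text) and
    # text_array.keyword (keyword).
    return {"text_array": pre_analyse(text)}


# Step 3: terms aggregation on the keyword sub-field.
# "top_tokens" is a hypothetical aggregation name.
agg_request = {
    "size": 0,
    "aggs": {
        "top_tokens": {
            "terms": {"field": "text_array.keyword", "size": 10}
        }
    },
}

doc = build_document("Quick brown fox, quick!")
print(json.dumps(doc))
print(json.dumps(agg_request))
```

Note that the dynamically created keyword sub-field uses ignore_above: 256 by default, so very long tokens would not be indexed in it.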

Does creating text_array.keyword make sense, or does it also consume heap when it is used in aggregations?

Any help on this would be much appreciated!

Hi,

By default, the doc_values setting is true for keyword fields, so aggregations read from on-disk doc values instead of loading fielddata into memory.
So it doesn't consume much heap at aggregation time.

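On the other hand, the text type field may not be necessary at all (it depends on your use case). If you only aggregate on the pre-tokenized values, the keyword field alone may be enough.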
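
As a minimal sketch of that idea, the field could be mapped explicitly as keyword only, so no text field is created. The bodies below are plain Python dicts, not calls to a client. The typeless mapping layout is an assumption that matches Elasticsearch 7 and later (older versions nest the properties under a mapping type name), and the aggregation name `top_tokens` is made up for illustration.

```python
import json

# Hypothetical index body: map the pre-tokenized field directly as keyword,
# skipping the text type entirely.
index_body = {
    "mappings": {
        "properties": {
            "text_array": {
                "type": "keyword",
                # doc_values is already true by default for keyword fields;
                # it is spelled out here only for clarity.
                "doc_values": True,
            }
        }
    }
}

# The aggregation then targets the field itself, not a ".keyword" sub-field.
agg_request = {
    "size": 0,
    "aggs": {
        "top_tokens": {
            "terms": {"field": "text_array", "size": 10}
        }
    },
}

print(json.dumps(index_body, indent=2))
print(json.dumps(agg_request, indent=2))
```

With this mapping you lose full-text search on the field, so it only fits if the field is used for aggregations and exact-match filters.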
