Elasticsearch terms aggregation on text fields

Dharmendra_Arya · May 28, 2020, 12:01pm

I have a requirement to find the top 10 keywords in text data (short strings only, limited to 5 or 6 keywords each).

To aggregate on the tokens of the text data, I can think of two approaches:

1. Set fielddata to true on the text field (it is false by default). What is the actual side effect of this when we have millions of log entries in one index?
2. Alternatively, tokenize the text outside the ES layer and store the resulting tokens in ES with the datatype "keyword".
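
To make the two options concrete, here is a minimal sketch of what I have in mind. The field names `message` and `message_keywords` are hypothetical, and the regex tokenizer is only a rough stand-in for a real analyzer, so treat this as an illustration rather than a tested setup. It only builds the request bodies; sending them to the cluster is left out.

```python
import re

# Approach 1: enable fielddata on the text field.
# "message" is a hypothetical field name.
FIELDDATA_MAPPING = {
    "properties": {
        "message": {"type": "text", "fielddata": True}
    }
}

# Approach 2: leave the text field alone and add a keyword field
# that holds tokens produced outside Elasticsearch.
# "message_keywords" is a hypothetical field name.
KEYWORD_MAPPING = {
    "properties": {
        "message": {"type": "text"},
        "message_keywords": {"type": "keyword"},
    }
}


def tokenize(text):
    """Lowercase the text and split on non-alphanumeric characters.

    This is an assumption for illustration; it only approximates
    what an Elasticsearch analyzer would produce.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def build_document(text):
    """Build the document to index for approach 2."""
    # Deduplicate tokens while preserving their first-seen order.
    tokens = list(dict.fromkeys(tokenize(text)))
    return {"message": text, "message_keywords": tokens}


def top_terms_query(field, size=10):
    """Build a terms aggregation body returning the top `size` terms."""
    return {
        "size": 0,  # skip the hits, return only the aggregation
        "aggs": {"top_keywords": {"terms": {"field": field, "size": size}}},
    }
```

With approach 1 the aggregation would target `message`; with approach 2 it would target `message_keywords`, and the query body is otherwise the same.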

Has anybody faced a similar challenge before, and are there any recommendations on how to go about it?

Are there any tools to validate or measure the JVM memory usage of approach #1?
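
One possible starting point, as far as I can tell, is the node stats API, which reports fielddata memory per node (`GET _nodes/stats/indices/fielddata?fields=*`). Below is a minimal sketch that reads it; the base URL `http://localhost:9200` and an unsecured cluster are assumptions, and the helper names are my own.

```python
import json
import urllib.request


def fielddata_stats_url(base_url):
    """Build the node stats URL restricted to fielddata, with a per-field breakdown."""
    return base_url.rstrip("/") + "/_nodes/stats/indices/fielddata?fields=*"


def fielddata_bytes_per_node(stats):
    """Map each node name to the heap bytes currently used by fielddata.

    `stats` is the parsed JSON body of the node stats response.
    """
    result = {}
    for node_id, node in stats.get("nodes", {}).items():
        fielddata = node.get("indices", {}).get("fielddata", {})
        # Fall back to the node id when the response carries no node name.
        result[node.get("name", node_id)] = fielddata.get("memory_size_in_bytes", 0)
    return result


def fetch_fielddata_bytes(base_url="http://localhost:9200"):
    """Query the cluster (assumed reachable without authentication)."""
    with urllib.request.urlopen(fielddata_stats_url(base_url)) as resp:
        return fielddata_bytes_per_node(json.load(resp))
```

Running this before and after the aggregation should show how much heap the fielddata for the text field takes. `GET _cat/fielddata?v` gives a similar view in tabular form.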

