How to prevent terms aggregation from causing OOM on data nodes

stephen.palfreyman · July 17, 2019, 9:44am

We have experienced an issue similar to the one described in this thread: one of our users tried to create a huge data table visualization in Kibana. The table has 2 buckets that result in 180,000 discrete values, which, perhaps unsurprisingly, causes serious memory problems in the cluster, to the point that some data nodes go OOM and drop out of the cluster. I already know that this isn't an appropriate use of a data table, and we've worked around the issue by using a composite aggregation instead. However, I would like to know whether there are any settings in Elasticsearch that would protect data nodes from going OOM in this circumstance. We have hundreds of users, and we cannot educate them all and ensure that they won't run crazy queries like this!
Thanks!
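
For anyone wanting to see the shape of the composite aggregation workaround mentioned above, here is a minimal sketch that only builds the request body for one page of results. The aggregation name "table", the source names, and the field names (host.keyword, user.keyword) are hypothetical placeholders, not taken from our actual visualization.

```python
import json


def composite_page(sources, size=1000, after_key=None):
    """Build a search body that pages through bucket combinations.

    sources   -- list of (name, field) pairs, one per bucket level
    size      -- number of composite buckets to return per page
    after_key -- the "after_key" returned by the previous page, if any
    """
    composite = {
        "size": size,
        "sources": [
            {name: {"terms": {"field": field}}} for name, field in sources
        ],
    }
    if after_key is not None:
        # Resume from where the previous page stopped.
        composite["after"] = after_key
    # "size": 0 suppresses hits so only aggregation results come back.
    return {"size": 0, "aggs": {"table": {"composite": composite}}}


if __name__ == "__main__":
    # Hypothetical field names, for illustration only.
    body = composite_page(
        [("host", "host.keyword"), ("user", "user.keyword")], size=500
    )
    print(json.dumps(body, indent=2))
```

Each response carries an after_key inside the composite aggregation result; passing it back as after_key fetches the next page, so no single request has to hold all 180,000 buckets at once.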

Mark_Harwood · July 17, 2019, 9:59am

Hi Stephen,
What version are you using?
There's a cluster setting (search.max_buckets) to limit the number of buckets produced by any aggregation.
Since version 7.0 this has defaulted to 10,000, but prior to that you have to set it manually.
Also, version 7.0 of Elasticsearch introduced the real memory circuit breaker.
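
As a rough sketch of setting the bucket limit manually on a pre-7.0 cluster: the snippet below builds a cluster settings update for search.max_buckets and, optionally, sends it to the _cluster/settings endpoint. The helper names and the base URL are assumptions for illustration; authentication and TLS are left out, and the value of 10000 simply mirrors the 7.0 default.

```python
import json
import urllib.request


def max_buckets_settings(limit=10000, persistent=True):
    """Return a cluster settings body that caps buckets per search response."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # Persistent settings survive a full cluster restart; transient ones do not.
    scope = "persistent" if persistent else "transient"
    return {scope: {"search.max_buckets": limit}}


def apply_cluster_settings(base_url, settings):
    """PUT the settings body to <base_url>/_cluster/settings.

    base_url is a hypothetical address such as "http://localhost:9200".
    """
    request = urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(settings).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Print the body only; call apply_cluster_settings(...) against a real
    # cluster once the limit has been agreed on.
    print(json.dumps(max_buckets_settings(10000), indent=2))
```

With a limit in place, a search that would produce more buckets than the limit is rejected with an error instead of being allowed to exhaust heap on the data nodes.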

stephen.palfreyman · July 17, 2019, 10:07am

Thanks Mark,
That's really useful information. We are on 6.8 but looking to upgrade to 7.2 in the near future, so I will look at implementing these as appropriate.
Thanks,
Steve

Mark_Harwood · July 17, 2019, 10:15am

Also of note is the new data frame functionality. Depending on what your users are doing with aggs, this may be useful. Users often use aggs at query time to join related data that is perhaps best joined at index time for better analysis.

