Understanding execution time and memory usage of a terms aggregation?

Hello everyone.

The cluster has 4 nodes (EC2 r3.2xlarge), each with 60GB of memory, 30GB of which is allocated to the Java heap.
We're doing a terms aggregation on a field that has 20M terms (9M distinct) spread across 19M documents.
The index has 5 shards and 1 replica.
The query we're performing filters down to 1M documents and executes the aggregation:

The mapping:

```ruby
attribute :urls, Array[String], mapping: { index: :not_analyzed, doc_values: true }
```

and the aggregation:

```ruby
"terms" => { "field" => "urls", "size" => 100 }
```

We tested the query by running many in sequence: it typically takes up to 6 seconds, and approximately every 5th query takes up to 30 seconds. We believe those worst cases happen because the nodes are garbage collecting.

Since we're having a hard time understanding what exactly is going on and what the limits are, I'll write down what we think is happening and would be grateful if someone could confirm we got it right or tell us what we got wrong.

  1. The terms aggregation is executed on every shard.
  2. Field data is read from disk and traversed.
  3. A bucket is created for each term encountered.
  4. Each document associated with the term is matched against the filter, and if it matches, the bucket's doc_count is incremented.
  5. A limited number of buckets (100) is returned to the client node performing the query.
  6. The client node merges the partial results and computes the final list.
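The steps above can be sketched as a toy model in plain Ruby; nothing here is Elasticsearch code, and the shard contents are made up purely for illustration:

```ruby
# Steps 1-5: each shard counts terms for documents matching the filter
# and returns only its top `size` buckets.
def shard_top_terms(docs, size)
  counts = Hash.new(0)
  docs.each do |doc|
    next unless doc[:matches_filter]              # step 4: filter check
    doc[:urls].each { |term| counts[term] += 1 }  # steps 2-3: traverse, bucket
  end
  counts.sort_by { |_term, count| -count }.first(size)  # step 5: top buckets
end

# Step 6: the client node sums doc_counts across shards and re-sorts.
def merge_shard_results(shard_results, size)
  merged = Hash.new(0)
  shard_results.each do |buckets|
    buckets.each { |term, count| merged[term] += count }
  end
  merged.sort_by { |_term, count| -count }.first(size)
end

shard_a = [{ matches_filter: true,  urls: %w[a b] },
           { matches_filter: false, urls: %w[a] }]
shard_b = [{ matches_filter: true,  urls: %w[a] }]
partials = [shard_a, shard_b].map { |docs| shard_top_terms(docs, 100) }
merge_shard_results(partials, 100)  # => [["a", 2], ["b", 1]]
```

Note the toy model ignores one real-world wrinkle: because each shard only returns its local top buckets, a term that is just below the cutoff on several shards can be undercounted or missed in the merged result.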

Our understanding of disk usage:

  1. Since field data is loaded from disk, disk performance has an important impact on execution time.

Our understanding of memory usage:

  1. Memory is used to load the field data that needs to be traversed, but this is bounded.
  2. Memory is used for the buckets that need to exist on every shard.
  3. Memory on the client node is used to merge the results.
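On point 2, the number of buckets each shard actually returns is governed by the `shard_size` parameter; older Elasticsearch releases default it to roughly `size * 1.5 + 10` (worth checking against your version's docs). During collection, a shard may still allocate a counter per distinct term it encounters, so with 9M distinct terms the per-shard collection memory is driven by distinct terms rather than by `size`. Illustrative arithmetic under those assumptions, not measurements:

```ruby
# Assumed default shard_size heuristic from older Elasticsearch docs;
# verify against the docs for your version.
size       = 100
shard_size = (size * 1.5 + 10).to_i   # => 160
num_shards = 5

# Buckets shipped from the shards to the client node for one query:
buckets_returned = shard_size * num_shards  # => 800
```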

Memory related assumptions:

  1. By increasing the number of shards, the field data for each shard will decrease, but the total across all shards will increase.
  2. By increasing the number of shards, the number of buckets per shard will decrease, but the total will increase.
  3. The memory needed on the client node to merge the results will stay the same.
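Toy arithmetic for these assumptions, using the numbers from the post and assuming an even spread of distinct terms across shards (the `shard_size` of 160 is an assumed default for `size` 100). One thing worth checking for assumption 3: the number of partial buckets the client node receives scales with the shard count:

```ruby
distinct_terms = 9_000_000
shard_size     = 160   # assumed per-shard top-N returned

summary = [5, 10, 20].map do |shards|
  {
    shards:          shards,
    terms_per_shard: distinct_terms / shards,  # assumptions 1-2: shrinks per shard
    merge_input:     shards * shard_size       # buckets arriving at the client node
  }
end
```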

Our understanding of CPU usage:

  1. CPU has its impact when sorting the buckets on every shard.
  2. CPU has its impact when merging the results on the client node.
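On the per-shard sort: selecting the top 100 buckets doesn't require sorting all distinct terms; a bounded top-k selection over the counted buckets is enough. A minimal sketch in plain Ruby (a linear scan with a small kept set; a real implementation would use a heap):

```ruby
# Keep only the k largest buckets while scanning all term counts once.
def top_k(term_counts, k)
  kept = []
  term_counts.each do |term, count|
    if kept.size < k
      kept << [term, count]
      kept.sort_by! { |_t, c| c }        # smallest kept bucket first
    elsif count > kept.first[1]
      kept[0] = [term, count]            # evict the smallest
      kept.sort_by! { |_t, c| c }
    end
  end
  kept.reverse                           # largest first
end

top_k({ "a" => 5, "b" => 1, "c" => 3, "d" => 4 }, 2)  # => [["a", 5], ["d", 4]]
```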

CPU related assumptions:

  1. By increasing the number of shards, we reduce the size of the sorted sets.
  2. We parallelise the processing and reduce the overall time.
  3. We increase the time spent merging the results, but we think this won't have a big impact.
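A back-of-envelope model of these assumptions; every constant below is invented purely for illustration, not measured on our cluster:

```ruby
# Per-shard time scales with terms traversed per shard; merge time scales
# with shards * shard_size. All cost constants are made up.
def estimated_ms(shards, distinct_terms: 9_000_000, shard_size: 160,
                 per_term_ns: 100, per_merge_bucket_ns: 1_000)
  shard_ms = (distinct_terms / shards) * per_term_ns / 1_000_000.0
  merge_ms = shards * shard_size * per_merge_bucket_ns / 1_000_000.0
  shard_ms + merge_ms
end

estimated_ms(5)   # => 180.8
estimated_ms(10)  # => 91.6
```

With these invented constants, doubling the shard count roughly halves per-shard time while the merge cost stays small, which is what assumptions 2 and 3 predict.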


How would you speed this up?