Terms aggregation + Cardinality Aggregation performance

(Jonathan Adamit) #1

My terms aggregation (on a high cardinality field) with an inner cardinality aggregation takes about 23 seconds to complete (one node, one shard, 300,000 documents, keyword fields).
The equivalent SQL query using SQL server takes a few seconds at most.

Is it reasonable of me to make that comparison? Maybe SQL Server is just more suited for this kind of queries?

I posted a related question with much more detail on Stack Overflow (https://stackoverflow.com/q/44225038/1545350) if anyone cares to assist there as well.

Thanks, Jonathan.

(Mark Harwood) #2

Try reduce the precision_threshold [1] setting to reduce the memory used to calculate these values for each of your many order_id buckets. The default value is 3,000 and I imagine the average number of items per order falls way below this value.

[1] https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html#_precision_control

(Jonathan Adamit) #3

Thanks! Didn't know it had this kind of effect on memory consumption!
This indeed lowers memory consumption for me below the circuit breaker.
Performance wasn't affected that much by this change, but it does help.

Side note - my comparison with SQL Server was faulted because of caching.

(system) #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.