Not sure if Elasticsearch itself is the limiting factor, but when running a big aggregation (100 million docs, a unique count on a high-cardinality field, using doc values), we see the following:

- it takes 1 minute
- CPU sits at about 50%
- RAM and heap usage are stable (no spikes at all)
- I/O peaks at 30 MB/s

I am wondering if there is something that could be limiting performance. (We have SSDs that should deliver 500 MB/s, so I am quite surprised.)
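For reference, the aggregation is roughly of this shape (the index and field names below are placeholders, not our real mapping):

```bash
# Cardinality aggregation over a high-cardinality field; size=0 skips returning hits
curl -s -X POST "localhost:9200/my-index/_search?size=0" \
  -H 'Content-Type: application/json' -d'
{
  "aggs": {
    "unique_values": {
      "cardinality": { "field": "some_high_cardinality_field" }
    }
  }
}'
```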
How many shards are you aggregating across? How many nodes do you have in the cluster?
Queries and aggregations are executed in a single thread per shard, but different shards can be processed in parallel. If you are aggregating across only a few shards per node and issue a single query at a time, you can therefore be CPU-limited even though overall CPU on the node is far from maxed out.
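You can check how the shards of the index are distributed across nodes with the _cat API (index name is a placeholder):

```bash
# One row per shard copy, including the node each shard is allocated to
curl -s "localhost:9200/_cat/shards/my-index?v"
```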
Is that 21 primary shards? How many CPU cores does each node have?
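If you are not sure about the core count, the nodes info API reports the processor counts Elasticsearch detects on each node:

```bash
# Look for available_processors / allocated_processors per node
curl -s "localhost:9200/_nodes/os?pretty"
```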
I would also recommend looking at iostat to see what disk utilisation and await look like. Read throughput on its own may not be the best measure, as this kind of load typically consists of a lot of small random-access reads.
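For example, on Linux, while the aggregation is running:

```bash
# Extended per-device stats every second; watch the await and %util columns
iostat -x 1
```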