Expensive aggregation puts load onto a single data node

Hello team!

I've been testing how effective our timeouts are and crafted a deliberately expensive query for that purpose. The cardinality of the customer_id field is in the hundreds of thousands, so you can imagine how expensive this aggregation is on a 100TB cluster with 77 indices.

{
  "size": 0,
  "aggs": {
    "counts": {
      "date_histogram": {
        "field": "created_at",
        "calendar_interval": "day"
      },
      "aggs": {
        "by_app": {
          "terms": {
            "field": "customer_id",
            "size": 100
          },
          "aggs": {
            "by_index": {
              "terms": {
                "field": "_index"
              }
            }
          }
        }
      }
    }
  }
}

I ran this as a /_search?timeout=20s request, expecting Elasticsearch to stop processing the query after 20 seconds.
Instead, Elasticsearch kept processing the query for ~40 minutes. The pattern of this processing is quite interesting:
In the first stage, for about 10 minutes, all nodes of the cluster showed elevated CPU usage. After that, only a single data node continued to show elevated CPU usage.
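
For anyone trying to reproduce this, the request and the response fields involved can be sketched roughly as follows. This is only an illustration: the host `localhost:9200`, the index pattern `production-*`, and the helper names `build_search_request` and `describe_outcome` are assumptions, not taken from the cluster above. `timeout` is the search URL parameter, and `took` (milliseconds) and `timed_out` are top-level fields of the search response.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(base_url, index_pattern, body, timeout="20s"):
    """Build (but do not send) a _search request carrying the timeout parameter."""
    query = urlencode({"timeout": timeout})
    url = f"{base_url}/{index_pattern}/_search?{query}"
    data = json.dumps(body).encode("utf-8")
    return Request(url, data=data, method="POST",
                   headers={"Content-Type": "application/json"})


def describe_outcome(response):
    """Summarize the top-level 'took' and 'timed_out' fields of a search response."""
    took_s = response["took"] / 1000.0
    if response.get("timed_out"):
        return f"timed_out=true after {took_s:.1f}s"
    return f"completed in {took_s:.1f}s"


# Hypothetical host and index pattern; the aggregation body is elided here.
req = build_search_request("http://localhost:9200", "production-*", {"size": 0})
print(req.get_method(), req.full_url)

# Hand-written response fragment, only to show how the fields are read.
print(describe_outcome({"took": 20500, "timed_out": True}))
```

The request is only built here, not sent; passing it to `urllib.request.urlopen` would execute it against a real cluster.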

This correlates with the estimated size in bytes reported for the request circuit breaker: the very same data node kept an elevated value for the duration of the query. The coordinator node also reported an elevated value for the duration of the query.
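
The per-node comparison described above can be pulled out of the node stats response with a few lines of Python. This is a sketch under assumptions: `node_stats` is taken to be the parsed JSON body of `GET /_nodes/stats/breaker`, the helper name `request_breaker_usage` is made up for illustration, and the sample values are invented rather than measured on this cluster.

```python
def request_breaker_usage(node_stats):
    """Return (node name, estimated bytes, limit bytes) tuples, largest estimate first.

    `node_stats` is the parsed JSON body of GET /_nodes/stats/breaker.
    """
    rows = []
    for node in node_stats.get("nodes", {}).values():
        breaker = node["breakers"]["request"]
        rows.append((node["name"],
                     breaker["estimated_size_in_bytes"],
                     breaker["limit_size_in_bytes"]))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Invented sample in the shape of the node stats response (values are illustrative).
sample = {
    "nodes": {
        "node-id-1": {"name": "es-data2", "breakers": {"request": {
            "estimated_size_in_bytes": 1048576,
            "limit_size_in_bytes": 1073741824}}},
        "node-id-2": {"name": "es-data1", "breakers": {"request": {
            "estimated_size_in_bytes": 734003200,
            "limit_size_in_bytes": 1073741824}}},
    }
}

for name, estimated, limit in request_breaker_usage(sample):
    print(f"{name}: {estimated / limit:.1%} of the request breaker limit")
```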

This is the shortened output of the hot threads API for the data node that remains pegged on CPU (the full output follows further below):

::: {es-data1}{3SRRwy3qRMG58Pahyxnz6A}{v58TFfWjT86wud3mexAr1A}{10.0.32.31}{10.0.32.31:9300}{cdfhirstw}{aws_availability_zone=us-east-1b, xpack.installed=true, transform.node=true}
   Hot threads at 2022-05-12T16:06:29.958Z, interval=5s, busiestThreads=3, ignoreIdleThreads=true:

   100.0% [cpu=96.7%, other=3.3%] (5s out of 5s) cpu usage by thread 'elasticsearch[es-data1][search][T#29]'
     10/10 snapshots sharing following 45 elements
       app//org.elasticsearch.search.aggregations.bucket.terms.LongKeyedBucketOrds$FromMany$1.next(LongKeyedBucketOrds.java:302)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$RemapGlobalOrds.forEach(GlobalOrdinalsStringTermsAggregator.java:557)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:602)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.access$200(GlobalOrdinalsStringTermsAggregator.java:575)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:182)
       app//org.elasticsearch.search.aggregations.bucket.BestBucketsDeferringCollector$2.buildAggregations(BestBucketsDeferringCollector.java:225)

   100.0% [cpu=96.6%, other=3.4%] (5s out of 5s) cpu usage by thread 'elasticsearch[es-data1][search][T#44]'
     10/10 snapshots sharing following 45 elements
       app//org.elasticsearch.search.aggregations.bucket.terms.LongKeyedBucketOrds$FromMany$1.next(LongKeyedBucketOrds.java:302)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$RemapGlobalOrds.forEach(GlobalOrdinalsStringTermsAggregator.java:557)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:602)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.access$200(GlobalOrdinalsStringTermsAggregator.java:575)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:182)
       app//org.elasticsearch.search.aggregations.bucket.BestBucketsDeferringCollector$2.buildAggregations(BestBucketsDeferringCollector.java:225)

   100.0% [cpu=96.5%, other=3.5%] (5s out of 5s) cpu usage by thread 'elasticsearch[es-data1][search][T#9]'
     7/10 snapshots sharing following 46 elements
       app//org.elasticsearch.common.util.LongLongHash.getKey1(LongLongHash.java:55)
       app//org.elasticsearch.search.aggregations.bucket.terms.LongKeyedBucketOrds$FromMany$1.next(LongKeyedBucketOrds.java:302)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$RemapGlobalOrds.forEach(GlobalOrdinalsStringTermsAggregator.java:557)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:602)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.access$200(GlobalOrdinalsStringTermsAggregator.java:575)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:182)
       app//org.elasticsearch.search.aggregations.bucket.BestBucketsDeferringCollector$2.buildAggregations(BestBucketsDeferringCollector.java:225)

Full hot threads output:

::: {es-data1}{3SRRwy3qRMG58Pahyxnz6A}{v58TFfWjT86wud3mexAr1A}{10.0.32.31}{10.0.32.31:9300}{cdfhirstw}{aws_availability_zone=us-east-1b, xpack.installed=true, transform.node=true}
   Hot threads at 2022-05-12T16:06:29.958Z, interval=5s, busiestThreads=3, ignoreIdleThreads=true:

   100.0% [cpu=96.7%, other=3.3%] (5s out of 5s) cpu usage by thread 'elasticsearch[es-data1][search][T#29]'
     10/10 snapshots sharing following 45 elements
       app//org.elasticsearch.search.aggregations.bucket.terms.LongKeyedBucketOrds$FromMany$1.next(LongKeyedBucketOrds.java:302)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$RemapGlobalOrds.forEach(GlobalOrdinalsStringTermsAggregator.java:557)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:602)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.access$200(GlobalOrdinalsStringTermsAggregator.java:575)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:182)
       app//org.elasticsearch.search.aggregations.bucket.BestBucketsDeferringCollector$2.buildAggregations(BestBucketsDeferringCollector.java:225)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForBuckets(BucketsAggregator.java:175)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForAllBuckets(BucketsAggregator.java:237)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.access$900(GlobalOrdinalsStringTermsAggregator.java:54)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$StandardTermsResults.buildSubAggs(GlobalOrdinalsStringTermsAggregator.java:761)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$StandardTermsResults.buildSubAggs(GlobalOrdinalsStringTermsAggregator.java:711)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:626)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.access$200(GlobalOrdinalsStringTermsAggregator.java:575)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:182)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForBuckets(BucketsAggregator.java:175)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildAggregationsForVariableBuckets(BucketsAggregator.java:350)
       app//org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramAggregator.buildAggregations(DateHistogramAggregator.java:300)
       app//org.elasticsearch.search.aggregations.Aggregator.buildTopLevel(Aggregator.java:154)
       app//org.elasticsearch.search.aggregations.AggregationPhase.execute(AggregationPhase.java:67)
       app//org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:104)
       app//org.elasticsearch.indices.IndicesService.lambda$loadIntoContext$26(IndicesService.java:1523)
       app//org.elasticsearch.indices.IndicesService$$Lambda$6433/0x0000000801bc7828.accept(Unknown Source)
       app//org.elasticsearch.indices.IndicesService.lambda$cacheShardLevelResult$27(IndicesService.java:1589)
       app//org.elasticsearch.indices.IndicesService$$Lambda$6434/0x0000000801bc8000.get(Unknown Source)
       app//org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:178)
       app//org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:161)
       app//org.elasticsearch.common.cache.Cache.computeIfAbsent(Cache.java:419)
       app//org.elasticsearch.indices.IndicesRequestCache.getOrCompute(IndicesRequestCache.java:124)
       app//org.elasticsearch.indices.IndicesService.cacheShardLevelResult(IndicesService.java:1595)
       app//org.elasticsearch.indices.IndicesService.loadIntoContext(IndicesService.java:1517)
       app//org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:456)
       app//org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:622)
       app//org.elasticsearch.search.SearchService.lambda$executeQueryPhase$2(SearchService.java:483)
       app//org.elasticsearch.search.SearchService$$Lambda$6245/0x0000000801b7d818.get(Unknown Source)
       app//org.elasticsearch.search.SearchService$$Lambda$6248/0x0000000801b7de90.get(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:47)
       app//org.elasticsearch.action.ActionRunnable$$Lambda$6249/0x0000000801b7e508.accept(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       app//org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
       app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:777)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       java.base@17.0.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
       java.base@17.0.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
       java.base@17.0.1/java.lang.Thread.run(Thread.java:833)

   100.0% [cpu=96.6%, other=3.4%] (5s out of 5s) cpu usage by thread 'elasticsearch[es-data1][search][T#44]'
     10/10 snapshots sharing following 45 elements
       app//org.elasticsearch.search.aggregations.bucket.terms.LongKeyedBucketOrds$FromMany$1.next(LongKeyedBucketOrds.java:302)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$RemapGlobalOrds.forEach(GlobalOrdinalsStringTermsAggregator.java:557)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:602)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.access$200(GlobalOrdinalsStringTermsAggregator.java:575)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:182)
       app//org.elasticsearch.search.aggregations.bucket.BestBucketsDeferringCollector$2.buildAggregations(BestBucketsDeferringCollector.java:225)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForBuckets(BucketsAggregator.java:175)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForAllBuckets(BucketsAggregator.java:237)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.access$900(GlobalOrdinalsStringTermsAggregator.java:54)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$StandardTermsResults.buildSubAggs(GlobalOrdinalsStringTermsAggregator.java:761)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$StandardTermsResults.buildSubAggs(GlobalOrdinalsStringTermsAggregator.java:711)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:626)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.access$200(GlobalOrdinalsStringTermsAggregator.java:575)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:182)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForBuckets(BucketsAggregator.java:175)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildAggregationsForVariableBuckets(BucketsAggregator.java:350)
       app//org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramAggregator.buildAggregations(DateHistogramAggregator.java:300)
       app//org.elasticsearch.search.aggregations.Aggregator.buildTopLevel(Aggregator.java:154)
       app//org.elasticsearch.search.aggregations.AggregationPhase.execute(AggregationPhase.java:67)
       app//org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:104)
       app//org.elasticsearch.indices.IndicesService.lambda$loadIntoContext$26(IndicesService.java:1523)
       app//org.elasticsearch.indices.IndicesService$$Lambda$6433/0x0000000801bc7828.accept(Unknown Source)
       app//org.elasticsearch.indices.IndicesService.lambda$cacheShardLevelResult$27(IndicesService.java:1589)
       app//org.elasticsearch.indices.IndicesService$$Lambda$6434/0x0000000801bc8000.get(Unknown Source)
       app//org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:178)
       app//org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:161)
       app//org.elasticsearch.common.cache.Cache.computeIfAbsent(Cache.java:419)
       app//org.elasticsearch.indices.IndicesRequestCache.getOrCompute(IndicesRequestCache.java:124)
       app//org.elasticsearch.indices.IndicesService.cacheShardLevelResult(IndicesService.java:1595)
       app//org.elasticsearch.indices.IndicesService.loadIntoContext(IndicesService.java:1517)
       app//org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:456)
       app//org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:622)
       app//org.elasticsearch.search.SearchService.lambda$executeQueryPhase$2(SearchService.java:483)
       app//org.elasticsearch.search.SearchService$$Lambda$6245/0x0000000801b7d818.get(Unknown Source)
       app//org.elasticsearch.search.SearchService$$Lambda$6248/0x0000000801b7de90.get(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:47)
       app//org.elasticsearch.action.ActionRunnable$$Lambda$6249/0x0000000801b7e508.accept(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       app//org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
       app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:777)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       java.base@17.0.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
       java.base@17.0.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
       java.base@17.0.1/java.lang.Thread.run(Thread.java:833)

   100.0% [cpu=96.5%, other=3.5%] (5s out of 5s) cpu usage by thread 'elasticsearch[es-data1][search][T#9]'
     7/10 snapshots sharing following 46 elements
       app//org.elasticsearch.common.util.LongLongHash.getKey1(LongLongHash.java:55)
       app//org.elasticsearch.search.aggregations.bucket.terms.LongKeyedBucketOrds$FromMany$1.next(LongKeyedBucketOrds.java:302)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$RemapGlobalOrds.forEach(GlobalOrdinalsStringTermsAggregator.java:557)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:602)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.access$200(GlobalOrdinalsStringTermsAggregator.java:575)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:182)
       app//org.elasticsearch.search.aggregations.bucket.BestBucketsDeferringCollector$2.buildAggregations(BestBucketsDeferringCollector.java:225)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForBuckets(BucketsAggregator.java:175)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForAllBuckets(BucketsAggregator.java:237)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.access$900(GlobalOrdinalsStringTermsAggregator.java:54)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$StandardTermsResults.buildSubAggs(GlobalOrdinalsStringTermsAggregator.java:761)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$StandardTermsResults.buildSubAggs(GlobalOrdinalsStringTermsAggregator.java:711)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:626)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.access$200(GlobalOrdinalsStringTermsAggregator.java:575)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:182)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForBuckets(BucketsAggregator.java:175)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildAggregationsForVariableBuckets(BucketsAggregator.java:350)
       app//org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramAggregator.buildAggregations(DateHistogramAggregator.java:300)
       app//org.elasticsearch.search.aggregations.Aggregator.buildTopLevel(Aggregator.java:154)
       app//org.elasticsearch.search.aggregations.AggregationPhase.execute(AggregationPhase.java:67)
       app//org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:104)
       app//org.elasticsearch.indices.IndicesService.lambda$loadIntoContext$26(IndicesService.java:1523)
       app//org.elasticsearch.indices.IndicesService$$Lambda$6433/0x0000000801bc7828.accept(Unknown Source)
       app//org.elasticsearch.indices.IndicesService.lambda$cacheShardLevelResult$27(IndicesService.java:1589)
       app//org.elasticsearch.indices.IndicesService$$Lambda$6434/0x0000000801bc8000.get(Unknown Source)
       app//org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:178)
       app//org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:161)
       app//org.elasticsearch.common.cache.Cache.computeIfAbsent(Cache.java:419)
       app//org.elasticsearch.indices.IndicesRequestCache.getOrCompute(IndicesRequestCache.java:124)
       app//org.elasticsearch.indices.IndicesService.cacheShardLevelResult(IndicesService.java:1595)
       app//org.elasticsearch.indices.IndicesService.loadIntoContext(IndicesService.java:1517)
       app//org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:456)
       app//org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:622)
       app//org.elasticsearch.search.SearchService.lambda$executeQueryPhase$2(SearchService.java:483)
       app//org.elasticsearch.search.SearchService$$Lambda$6245/0x0000000801b7d818.get(Unknown Source)
       app//org.elasticsearch.search.SearchService$$Lambda$6248/0x0000000801b7de90.get(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:47)
       app//org.elasticsearch.action.ActionRunnable$$Lambda$6249/0x0000000801b7e508.accept(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       app//org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
       app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:777)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       java.base@17.0.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
       java.base@17.0.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
       java.base@17.0.1/java.lang.Thread.run(Thread.java:833)
     3/10 snapshots sharing following 45 elements
       app//org.elasticsearch.search.aggregations.bucket.terms.LongKeyedBucketOrds$FromMany$1.next(LongKeyedBucketOrds.java:302)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$RemapGlobalOrds.forEach(GlobalOrdinalsStringTermsAggregator.java:557)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:602)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.access$200(GlobalOrdinalsStringTermsAggregator.java:575)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:182)
       app//org.elasticsearch.search.aggregations.bucket.BestBucketsDeferringCollector$2.buildAggregations(BestBucketsDeferringCollector.java:225)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForBuckets(BucketsAggregator.java:175)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForAllBuckets(BucketsAggregator.java:237)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.access$900(GlobalOrdinalsStringTermsAggregator.java:54)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$StandardTermsResults.buildSubAggs(GlobalOrdinalsStringTermsAggregator.java:761)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$StandardTermsResults.buildSubAggs(GlobalOrdinalsStringTermsAggregator.java:711)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:626)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.access$200(GlobalOrdinalsStringTermsAggregator.java:575)
       app//org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:182)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForBuckets(BucketsAggregator.java:175)
       app//org.elasticsearch.search.aggregations.bucket.BucketsAggregator.buildAggregationsForVariableBuckets(BucketsAggregator.java:350)
       app//org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramAggregator.buildAggregations(DateHistogramAggregator.java:300)
       app//org.elasticsearch.search.aggregations.Aggregator.buildTopLevel(Aggregator.java:154)
       app//org.elasticsearch.search.aggregations.AggregationPhase.execute(AggregationPhase.java:67)
       app//org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:104)
       app//org.elasticsearch.indices.IndicesService.lambda$loadIntoContext$26(IndicesService.java:1523)
       app//org.elasticsearch.indices.IndicesService$$Lambda$6433/0x0000000801bc7828.accept(Unknown Source)
       app//org.elasticsearch.indices.IndicesService.lambda$cacheShardLevelResult$27(IndicesService.java:1589)
       app//org.elasticsearch.indices.IndicesService$$Lambda$6434/0x0000000801bc8000.get(Unknown Source)
       app//org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:178)
       app//org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:161)
       app//org.elasticsearch.common.cache.Cache.computeIfAbsent(Cache.java:419)
       app//org.elasticsearch.indices.IndicesRequestCache.getOrCompute(IndicesRequestCache.java:124)
       app//org.elasticsearch.indices.IndicesService.cacheShardLevelResult(IndicesService.java:1595)
       app//org.elasticsearch.indices.IndicesService.loadIntoContext(IndicesService.java:1517)
       app//org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:456)
       app//org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:622)
       app//org.elasticsearch.search.SearchService.lambda$executeQueryPhase$2(SearchService.java:483)
       app//org.elasticsearch.search.SearchService$$Lambda$6245/0x0000000801b7d818.get(Unknown Source)
       app//org.elasticsearch.search.SearchService$$Lambda$6248/0x0000000801b7de90.get(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:47)
       app//org.elasticsearch.action.ActionRunnable$$Lambda$6249/0x0000000801b7e508.accept(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       app//org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
       app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:777)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       java.base@17.0.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
       java.base@17.0.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
       java.base@17.0.1/java.lang.Thread.run(Thread.java:833)
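
Dumps like the one above are long, so it can help to reduce them to one line per thread. The following is a rough Python sketch, not an official parser: the helper name `busiest_threads` is hypothetical, and it assumes the text layout shown above (a "cpu usage by thread" header followed by stack frames prefixed with `app//` or `java.base@`).

```python
import re

# Matches the per-thread header line, e.g. "100.0% [...] cpu usage by thread '...'"
THREAD_RE = re.compile(
    r"^\s*(?P<pct>\d+(?:\.\d+)?)% .*cpu usage by thread '(?P<name>[^']+)'")


def busiest_threads(hot_threads_text):
    """Return (percent, thread name, top stack frame) for each reported thread."""
    results = []
    current = None
    for line in hot_threads_text.splitlines():
        match = THREAD_RE.match(line)
        stripped = line.strip()
        if match:
            current = [float(match["pct"]), match["name"], None]
            results.append(current)
        elif (current is not None and current[2] is None
              and stripped.startswith(("app//", "java.base@"))):
            # First frame after the header is the top of the stack.
            current[2] = stripped
    return [tuple(row) for row in results]


# Two entries copied from the shortened output above.
sample = r"""
   100.0% [cpu=96.7%, other=3.3%] (5s out of 5s) cpu usage by thread 'elasticsearch[es-data1][search][T#29]'
     10/10 snapshots sharing following 45 elements
       app//org.elasticsearch.search.aggregations.bucket.terms.LongKeyedBucketOrds$FromMany$1.next(LongKeyedBucketOrds.java:302)

   100.0% [cpu=96.5%, other=3.5%] (5s out of 5s) cpu usage by thread 'elasticsearch[es-data1][search][T#9]'
     7/10 snapshots sharing following 46 elements
       app//org.elasticsearch.common.util.LongLongHash.getKey1(LongLongHash.java:55)
"""

for pct, name, frame in busiest_threads(sample):
    print(f"{pct:5.1f}%  {name}  {frame}")
```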

Our cluster is running Elasticsearch 7.17.0 and configured with the following settings:

PUT /_cluster/settings
{
  "persistent" : {
    "action" : {
      "auto_create_index" : "false"
    },
    "cluster" : {
      "routing" : {
        "allocation" : {
          "cluster_concurrent_rebalance" : "50",
          "node_concurrent_recoveries" : "50",
          "disk" : {
            "watermark" : {
              "low" : "75%",
              "high" : "85%"
            }
          },
          "node_initial_primaries_recoveries" : "186"
        }
      }
    },
    "indices" : {
      "breaker" : {
        "fielddata" : {
          "limit" : "1gb"
        },
        "request" : {
          "limit" : "1gb"
        }
      },
      "recovery" : {
        "max_bytes_per_sec" : "2000mb"
      }
    },
    "search" : {
      "max_buckets" : "30000",
      "default_allow_partial_results" : "false",
      "max_open_scroll_context" : "40000"
    },
    "logger" : {
      "org" : {
        "elasticsearch" : {
          "deprecation" : "OFF"
        }
      }
    }
  },
  "transient" : { }
}

As you can see, we allow a large number of aggregation buckets; unfortunately, we do need to run some aggregations over high-cardinality fields in production traffic too. Our request circuit breaker is set to 1gb, and we tried lowering it to 500mb while the query was running, but it didn't help.
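
For reference, a change like the one described (lowering the request breaker limit on a running cluster) corresponds to a body for `PUT /_cluster/settings` along these lines. This is a minimal sketch: the helper name `breaker_limit_update` is made up, and using the `transient` scope is an assumption, since the post does not say which scope was used.

```python
import json


def breaker_limit_update(limit, scope="transient"):
    """Build the body for PUT /_cluster/settings that changes the request breaker limit."""
    if scope not in ("transient", "persistent"):
        raise ValueError("scope must be 'transient' or 'persistent'")
    return {scope: {"indices.breaker.request.limit": limit}}


# Assumed scope: transient. The value 500mb is the one mentioned above.
print(json.dumps(breaker_limit_update("500mb"), indent=2))
```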

The cluster has 10 data nodes, and every index that holds data is configured with at least 10 primaries and one replica. Hence, every node holds at least 2 shards of any given index.
Here is the number of indices per primary shard count (e.g. 39 indices with 10 primaries, 12 indices with 15 primaries, etc.):

[ec2-user@es-data1 ~]$ curl -s localhost:9200/_cat/indices?h=index,pri | grep production | awk '{print $2}' | sort | uniq -c
     39 10
     12 15
      3 24
      3 40
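
The shard arithmetic behind these numbers can be checked with a short Python sketch. It assumes one replica per primary and 10 data nodes, as stated above, and it assumes shards are spread evenly across nodes, which the output above does not by itself confirm.

```python
DATA_NODES = 10   # from the post
REPLICAS = 1      # one replica per primary, from the post

# (number of indices, primaries per index), copied from the _cat/indices output
index_groups = [(39, 10), (12, 15), (3, 24), (3, 40)]


def shard_copies(primaries, replicas=REPLICAS):
    """Total shard copies (primaries plus replicas) for one index."""
    return primaries * (1 + replicas)


total = 0
for count, primaries in index_groups:
    copies = shard_copies(primaries)
    total += count * copies
    print(f"{count} indices x {primaries} primaries: {copies} copies per index, "
          f"{copies / DATA_NODES:.1f} per node on average")

print(f"total shard copies: {total}, about {total / DATA_NODES:.0f} per data node")
```

With even allocation, an index with 10 primaries and one replica puts 2 shard copies on each node, which is where the "at least 2 shards" figure comes from.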

The cluster is very old, so there are plenty of ~5gb segments, and every node has them. So it is unlikely that only a single data node has a big segment that blocks query cancellation (from what I've read, cancellation/timeout is not effective until the search crosses a segment boundary).

There are several questions here that I'd like to tackle:

  • Why didn't timeout=20s make Elasticsearch stop processing this query after 20 seconds?
  • Why does only a single data node stay pegged on CPU for ~40 minutes while all the other nodes return to normal relatively quickly?
  • Why didn't lowering the circuit breaker limit halt the query?
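
One way to see which node is still working on the query is the task management API. The Python sketch below filters a parsed `GET /_tasks?actions=*search*&detailed` response (default grouping by node) for long-running search tasks; a task found this way can be cancelled with `POST /_tasks/<task_id>/_cancel`. The helper name `long_running_search_tasks` is hypothetical, and the sample task numbers and running times are invented for illustration.

```python
def long_running_search_tasks(tasks_response, min_seconds=20):
    """Pick search tasks running at least min_seconds from a GET /_tasks response.

    Expects the default response grouping (by node).
    """
    found = []
    for node_id, node in tasks_response.get("nodes", {}).items():
        for task_id, task in node.get("tasks", {}).items():
            seconds = task["running_time_in_nanos"] / 1e9
            if "search" in task["action"] and seconds >= min_seconds:
                found.append((node.get("name", node_id), task_id,
                              task["action"], round(seconds)))
    # Longest-running tasks first.
    return sorted(found, key=lambda row: row[3], reverse=True)


# Invented sample in the shape of the tasks response (task numbers are made up).
sample = {"nodes": {"3SRRwy3qRMG58Pahyxnz6A": {"name": "es-data1", "tasks": {
    "3SRRwy3qRMG58Pahyxnz6A:101": {
        "action": "indices:data/read/search[phase/query]",
        "running_time_in_nanos": 1_500_000_000_000,
        "cancellable": True},
    "3SRRwy3qRMG58Pahyxnz6A:102": {
        "action": "indices:data/read/search[phase/query]",
        "running_time_in_nanos": 2_000_000_000,
        "cancellable": True},
}}}}

for node_name, task_id, action, seconds in long_running_search_tasks(sample):
    print(f"{node_name} {task_id} {action} running for {seconds}s")
```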

We noticed this pattern of "hot" nodes in production some time ago. It is not the same node every time; different nodes become pegged on CPU. However, with this specific query we can now reproduce the issue consistently, and it is always the data1 node that becomes hot on CPU.

Our cluster has 3 separate master-eligible nodes, 3 separate coordinator nodes and the rest of the nodes are exclusively data nodes.

Any help with this is highly appreciated so thank you in advance.
