Hello,
We have a product search service that our clients use to search our products. Each of our two datacenters has an Elasticsearch cluster, deployed by applying an Elasticsearch kind YAML to Kubernetes together with the ECK operator.
In our Elasticsearch YAML files we request 3 pods, all of them with combined master and data roles:

count: 3 # We need 3 nodes with combined master and data capabilities
We also request 8 CPU and 8 Gi of RAM, with a limit of 8 Gi of RAM and 24 CPU:
resources:
  requests:
    memory: 8Gi
    cpu: 8
  limits:
    memory: 8Gi
    cpu: 24
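
For completeness, the relevant part of the nodeSet spec looks roughly like this (the nodeSet name is illustrative and node.roles is shown explicitly for clarity; only the count and resources above are exact):

nodeSets:
- name: default
  count: 3
  config:
    node.roles: ["master", "data"]
  podTemplate:
    spec:
      containers:
      - name: elasticsearch
        resources:
          requests:
            memory: 8Gi
            cpu: 8
          limits:
            memory: 8Gi
            cpu: 24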
At certain peak usage times we see 429 TOO_MANY_REQUESTS errors in the logs of our client services, so I checked our Grafana dashboards and the clusters' _cat/thread_pool endpoint.
In both I see that the search thread pool caps out at 37, which is very strange, because most of the time our pods do not even use the 8 CPUs they request (this is visible in our pod monitoring).
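
The cat request I ran was roughly this (columns trimmed to the ones I care about):

GET _cat/thread_pool/search?v&h=node_name,name,size,active,queue,rejected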
I would expect the search thread pool to be sized at 13, since we request 8 CPUs and the default search pool size is int((allocated processors * 3) / 2) + 1.
My question is: is this normal? Is it possible that ECK-deployed clusters derive the "fixed" size of the search thread pool from the CPU limit rather than the request? A limit of 24 CPUs would give int((24 * 3) / 2) + 1 = 37, which is exactly what we see. And if that is the case, how can the Elasticsearch data nodes keep up with a search thread pool of 37 when they only request 8 CPUs?
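
In case it is useful, my assumption is that the relevant number is what each node reports as os.allocated_processors in the nodes info API, so I am checking it with something like this (filter_path only trims the output):

GET _nodes/os?filter_path=nodes.*.name,nodes.*.os.allocated_processors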
I will also paste part of the _cluster/settings output:
"thread_pool" : {
"force_merge" : {
"queue_size" : "-1",
"size" : "3"
},
"search_coordination" : {
"queue_size" : "1000",
"size" : "12"
},
"snapshot_meta" : {
"core" : "1",
"max" : "50",
"keep_alive" : "30s"
},
"fetch_shard_started" : {
"core" : "1",
"max" : "48",
"keep_alive" : "5m"
},
"estimated_time_interval.warn_threshold" : "5s",
"scheduler" : {
"warn_threshold" : "5s"
},
"cluster_coordination" : {
"queue_size" : "-1",
"size" : "1"
},
"search" : {
"queue_size" : "1000",
"size" : "37"
}
Regards.