Hello,
We have a product search service that our clients use to search our products. Each of our two datacenters has an Elasticsearch cluster, deployed by applying an Elasticsearch kind YAML to Kubernetes together with the ECK operator.
In our Elasticsearch YAML files we request 3 pods, all of them with combined master and data roles:

count: 3 # We need 3 nodes with combined master and data capabilities
We also request 8 CPU and 8 Gi of RAM, with a limit of 8 Gi of RAM and 24 CPU:
resources:
  requests:
    memory: 8Gi
    cpu: 8
  limits:
    memory: 8Gi
    cpu: 24
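
For completeness, the relevant part of the nodeSet spec looks roughly like this (the nodeSet name is illustrative and node.roles is shown explicitly for clarity; only the count and resources above are exact):

nodeSets:
- name: default
  count: 3
  config:
    node.roles: ["master", "data"]
  podTemplate:
    spec:
      containers:
      - name: elasticsearch
        resources:
          requests:
            memory: 8Gi
            cpu: 8
          limits:
            memory: 8Gi
            cpu: 24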
At certain peak usage times we see 429 TOO_MANY_REQUESTS errors in the logs of our client services, so I checked our Grafana dashboards and the clusters' _cat/thread_pool endpoint.
In both I see that the search thread pool caps out at 37, which is very strange, because most of the time our pods do not even use the 8 CPUs they request (this is visible in our pod monitoring).
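
The cat request I ran was roughly this (columns trimmed to the ones I care about):

GET _cat/thread_pool/search?v&h=node_name,name,size,active,queue,rejected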
I would expect the search thread pool to be sized at 13, since we request 8 CPUs and the default search pool size is int((allocated processors * 3) / 2) + 1.
My question is: is this normal? Is it possible that ECK-deployed clusters derive the "fixed" size of the search thread pool from the CPU limit rather than the request? A limit of 24 CPUs would give int((24 * 3) / 2) + 1 = 37, which is exactly what we see. And if that is the case, how can the Elasticsearch data nodes keep up with a search thread pool of 37 when they only request 8 CPUs?
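
In case it is useful, my assumption is that the relevant number is what each node reports as os.allocated_processors in the nodes info API, so I am checking it with something like this (filter_path only trims the output):

GET _nodes/os?filter_path=nodes.*.name,nodes.*.os.allocated_processors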
I will also paste part of the _cluster/settings output:
"thread_pool" : {
"force_merge" : {
"queue_size" : "-1",
"size" : "3"
},
"search_coordination" : {
"queue_size" : "1000",
"size" : "12"
},
"snapshot_meta" : {
"core" : "1",
"max" : "50",
"keep_alive" : "30s"
},
"fetch_shard_started" : {
"core" : "1",
"max" : "48",
"keep_alive" : "5m"
},
"estimated_time_interval.warn_threshold" : "5s",
"scheduler" : {
"warn_threshold" : "5s"
},
"cluster_coordination" : {
"queue_size" : "-1",
"size" : "1"
},
"search" : {
"queue_size" : "1000",
"size" : "37"
}
Regards.