Hi, everyone:
We have been experiencing poor indexing performance on an Elasticsearch cluster deployed with Kubernetes. The Kubernetes cluster consists of 4 nodes (dedicated machines running Ubuntu), each with 32 GB of RAM and 4 cores / 8 threads. We deploy 4 Elasticsearch nodes, one on each of the Kubernetes nodes.
When running some benchmarks with Filebeat, we achieved almost twice the indexing speed with a single Elasticsearch node deployed with Docker alone than with the 4-node Elasticsearch cluster deployed with Kubernetes. That pointed our suspicion at Kubernetes.
After conducting several tests we found the root cause of the problem. Somehow, when Kubernetes launches the Elasticsearch pods, it must be setting some JVM/Docker parameters that cause the Java method Runtime.availableProcessors() to report 1 instead of 8.
Because of this, the Elasticsearch cluster stats (_cluster/stats) reported only 1 available and 1 allocated processor:
{
  [...]
  "os" : {
    "available_processors" : 1,
    "allocated_processors" : 1
  }
  [...]
}
Since Elasticsearch uses available_processors to size its internal thread pools (_nodes/thread_pool), the pools were configured with only one thread, which caused the poor indexing performance.
We managed to partially work around this by setting a CPU limit of 6 on each of the Elasticsearch containers in the Kubernetes deployment. Once the limit is established, both available_processors and allocated_processors report 6.
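For reference, the workaround looks roughly like this in the container spec of the deployment (the container and image names below are just placeholders for our actual manifest):

containers:
  - name: elasticsearch
    image: our-elasticsearch-image   # placeholder for the image we actually run
    resources:
      limits:
        cpu: "6"                     # with this limit, available_processors becomes 6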
We would like to know if there is any way to configure Elasticsearch (via elasticsearch.yml or java_opts) so that it uses all 8 processors, without setting a CPU limit in Kubernetes.
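For example, if the processors setting in elasticsearch.yml (or node.processors in newer versions) overrides the value detected from Runtime.availableProcessors() for thread pool sizing, something like the following would be enough, but we have not been able to confirm that it works in our setup:

# elasticsearch.yml, sketch assuming the processors setting overrides the detected value
processors: 8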
Is it possible?
Thanks in advance,
Rodrigo