Elasticsearch 6.8.12 os.cpu always 100 although load varies and stays below number of processors

I have a small Elasticsearch cluster running on Kubernetes 1.18 in AKS. It has 3 ingest nodes, 3 masters and 4 data nodes. I do not have much data, only a few gigabytes spread over a dozen indices. It has been set up using Helm 3 and the elastic chart 6.8.12 [0].

When I run GET /_cat/nodes?v&s=load_1m:desc, you can see all my data nodes report 100 for os.cpu while the load varies between 2 and 4.

ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.244.15.72           48         100 100    3.52    3.06     2.65 di        -      elasticsearch-test-data-2
10.244.0.237           59         100 100    3.21    3.03     2.87 di        -      elasticsearch-test-data-0
10.244.1.18            48          99 100    3.17    3.22     3.32 di        -      elasticsearch-test-data-1
10.244.17.9            69         100 100    2.72    2.67     3.02 di        -      elasticsearch-test-data-3
10.244.17.8            55          66   3    2.72    2.67     3.02 mi        *      elasticsearch-test-master-2
10.244.17.5            37          84  26    2.72    2.67     3.02 i         -      elasticsearch-test-client-2
10.244.18.42           41          81  28    0.62    0.70     0.63 i         -      elasticsearch-test-client-0
10.244.18.10           66          64   1    0.62    0.70     0.63 mi        -      elasticsearch-test-master-0
10.244.19.12           31          79  26    0.48    0.30     0.35 i         -      elasticsearch-test-client-1
10.244.19.51           18          64   1    0.48    0.30     0.35 mi        -      elasticsearch-test-master-1

I have confirmed the load values are likely correct in several ways:

  • executing top inside the pods for the data nodes, confirming the values match the reported load and that the java process accounts for pretty much all of it

  • executing top across the pods for the data nodes, confirming they are all pretty much the same

  • checking Application Insights reports on how much CPU my data nodes are using

I have configured Kubernetes resources to request 6 CPUs and 4Gi RAM; the limits are 8 CPUs and 4Gi RAM. I run on A8 machines with 8 CPUs, and I have confirmed that each data node runs on its own Kubernetes node with full access to the CPUs.

I confirmed the OS view of the CPUs by checking /proc/cpuinfo. I have also confirmed that the data nodes each have 6 processors configured. I have monitored my data nodes over the past week and seen their CPU usage move between 2 and 7 cores depending on load.
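As a further sanity check of what the JVM itself sees (a minimal sketch, not something from the cluster above; with container support enabled, newer JDKs derive this from the cgroup settings rather than the host):

```java
public class AvailableProcessors {
    public static void main(String[] args) {
        // With -XX:+UseContainerSupport (the default since JDK 10), this
        // reflects the container's CPU quota/shares, not the host's CPU count.
        System.out.println(Runtime.getRuntime().availableProcessors());
    }
}
```

Run inside the pod, this should print the processor count the JVM is working with.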

So all my information points to the os.cpu number being incorrect. When checking /_cluster/stats/nodes/<node>, I get the process.cpu value, and it corresponds very well with the load values and the process CPU value from inside the pod. When checking /_nodes/<node>/stats, I get both the os.cpu value (100) and the process.cpu value (which looks fine).

When looking at hot threads or the thread pool, I see nothing that bothers me. Of course there are some hot threads on some of the nodes but nothing blocking or locking for any significant amount of time.

Is it perhaps a bug in Elastic's reporting? Am I having issues with the OS view of the VM while running in Kubernetes? Can it be troublesome that my data nodes are also ingest nodes? How can I continue troubleshooting and hopefully, at some point, sleep well? Or am I just reading this CPU value the wrong way?

[0] https://github.com/elastic/helm-charts/blob/6.8.12/elasticsearch/README.md

The os.cpu number is just 100× whatever the JDK's OperatingSystemMXBean#getSystemCpuLoad method reports. What that really means depends on which JDK you're using; in JDK 14, at least, it works differently inside containers than outside:

In turn these numbers are read fairly directly from /proc/self/...:

Older JDKs were not so aware of containerised environments, so maybe you need a newer JDK to get accurate figures here?
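The value Elasticsearch reads can be reproduced in a few lines of Java (a sketch for checking from inside the pod; note that getSystemCpuLoad is deprecated in favour of getCpuLoad as of JDK 14, and whether either is cgroup-aware depends on the JDK):

```java
import java.lang.management.ManagementFactory;

public class OsCpu {
    public static void main(String[] args) {
        com.sun.management.OperatingSystemMXBean os =
                (com.sun.management.OperatingSystemMXBean)
                        ManagementFactory.getOperatingSystemMXBean();
        // Returns the recent system CPU load in [0.0, 1.0],
        // or a negative value if it is not available.
        double load = os.getSystemCpuLoad();
        // Multiplied by 100, this is what _cat/nodes shows as os.cpu.
        System.out.println(Math.round(load * 100));
    }
}
```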

This is very interesting indeed! Thanks for your input.

We are running version 14 on the nodes:

"version" : "14.0.1",
"vm_name" : "OpenJDK 64-Bit Server VM",
"vm_version" : "14.0.1+7",
"vm_vendor" : "AdoptOpenJDK",

As for the information under /proc/self/... I will have to do some more digging to connect the dots. I have not looked at this information before and need to learn more to understand what I am looking at.

Ok, things have moved slightly between your JDK and the version to which I linked above; the relevant code is here:

(not sure it's very different but it's probably best to look at the right version)

Wow, thanks. That made it much easier.

I parsed the stats manually and did the calculation by hand, and it always comes up above 2, which according to the code is then clamped to 1, i.e. 100%.

It seems to me that the calculation is off: it does not respect the quota. It checks for quota > 0 but then does not use it. In my case, the reported quota is 800000; with the period being 100000, that leaves 8 CPUs. If I factor my 8 CPUs into the elapsedNanos, my CPU percentage comes back at 0.25-0.3, as expected.
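To make the arithmetic concrete, here is a hypothetical sketch of the quota-respecting version of the calculation (the method and parameter names are mine, not the JDK's), using the values reported above: a quota of 800000 µs and a period of 100000 µs, i.e. 8 CPUs:

```java
public class CgroupCpuLoad {
    // Hypothetical quota-aware load: CPU time consumed, divided by the
    // wall-clock window scaled by the number of CPUs the quota allows.
    static double cpuLoad(long deltaUsageNanos, long elapsedNanos,
                          long quotaUs, long periodUs) {
        double cpus = (quotaUs > 0 && periodUs > 0)
                ? (double) quotaUs / periodUs   // 800000 / 100000 = 8
                : 1.0;
        double load = deltaUsageNanos / (elapsedNanos * cpus);
        return Math.min(load, 1.0);             // clamp at 100%
    }

    public static void main(String[] args) {
        // e.g. 2.4 s of CPU time over a 1 s window with an 8-CPU quota
        System.out.println(cpuLoad(2_400_000_000L, 1_000_000_000L,
                                   800_000, 100_000)); // prints 0.3, i.e. ~30%
    }
}
```

Without the division by cpus, the same numbers give 2.4, which is exactly the "above 2, clamped to 1" behaviour I am seeing.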

I verified this number with a CPU usage script I've used in the past [0].

So there seems to be a mismatch between what the JDK expects the system to report and what is actually reported. Any thoughts? Bug in the OS or bug in the JDK?

OS info:

"name" : "Linux",
"pretty_name" : "CentOS Linux 7 (Core)",
"arch" : "amd64",
"version" : "5.4.0-1034-azure",

[0] https://gist.github.com/pcolby/6558833

As it's outside Elasticsearch, your guess is as good as mine I'm afraid. It could also be something specific to AKS, since I imagine this is using a somewhat customised OS too.
