Elastic Search causing major page memory errors on Azure Kubernetes (AKS)

DavidDean · January 23, 2024, 9:28pm

ES version: 7.17.5.1
Hosting: Azure Kubernetes (AKS)
Kubernetes: 1.24.9
Virtual machine: D8ads v5 (8 vCPUs, 32 GB RAM)
Virtual machine OS: Ubuntu 18.04.6 LTS

Prometheus is reporting very high rates of major memory page faults on the virtual machine running Elastic Search.

However there is no SWAP being used, the max_map_count is correct (262144) and disk write ahead value seems to be correct at 128.

In terms of resource utilisation, ES is only using a maximum, of 4 vCPU cores of a total available of 8, and the virtual machine is only using 19Gb of a total 32Gb.

The disks are busy but don't seem overloaded:

...so I cannot understand why there are so many major page faults being reported?

I raised this with Microsoft Support who said this is an application issue and that I should raise it with Elastic Search. So here I am!

I wasn't sure whether the root cause was the memory mapped storage type used by Elastic Search, which seems to be unique vs other applications?

Would really appreciate some help from the gurus on here!

DavidDean · January 23, 2024, 9:29pm

Worker node resource utilisation:

DavidDean · January 23, 2024, 9:35pm

We are using PVC disks of type StandardSSD_LRS - here is the specification:

system · February 20, 2024, 9:35pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.