Hi,
At our company we have an Elasticsearch 6.1 cluster. To prepare for the upgrade to 6.8 or later, we've started rolling out new machines on Ubuntu 20.04 (Focal Fossa) instead of Ubuntu 18.04 (Bionic Beaver).
The cluster consists of 4 data nodes, 3 search nodes, 3 masters. The 4 data nodes are named data-01 / data-02 / data-03 / data-04 and split into two pools via node attributes: odd: pool1 | even: pool2.
The JDK version is the same on all nodes: the latest OpenJDK from the same PPA.
The "maybe memleak" shows up on nodes 3 and 4 (one in each pool), both on Focal Fossa (20.04), but not on nodes 1 and 2 (also one in each pool), both on Bionic Beaver (18.04).
"Maybe memleak" means: the nodes start using swap space, not explosively, but at a slow and steady pace (~2 GB in 12 hours).
All nodes use MMAPFS. As far as I understand it, using the MMAPFS store means the following.
64 GB of RAM per node:
- There's the heap (~28 GB), which is pre-touched (-XX:+AlwaysPreTouch) and used by Elasticsearch's Java process; it's mlocked, so it cannot be swapped
- Lucene's index files on disk are mmapped, so they live outside the mlocked heap
- Via mmap, Elasticsearch/Lucene can access the files and read the necessary index data into RAM
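Since the reasoning above depends on the heap really being mlocked, here is a minimal sketch of how that could be double-checked from the output of GET _nodes/process, which reports an mlockall flag per node. The sample response is made up and trimmed to the relevant fields, and the helper name nodes_without_mlockall is hypothetical.

```python
import json


def nodes_without_mlockall(nodes_response):
    """Return the names of nodes whose process section does not report mlockall as true."""
    unlocked = []
    for node_id, info in nodes_response.get("nodes", {}).items():
        locked = info.get("process", {}).get("mlockall", False)
        if not locked:
            # Fall back to the node id if the name is missing from the response.
            unlocked.append(info.get("name", node_id))
    return sorted(unlocked)


# Made-up response for illustration only; real output has many more fields.
sample_response = json.loads("""
{
  "nodes": {
    "id-a": {"name": "data-03", "process": {"mlockall": true}},
    "id-b": {"name": "data-04", "process": {"mlockall": false}}
  }
}
""")

if __name__ == "__main__":
    print(nodes_without_mlockall(sample_response))
```

If a node shows up in that list, its heap is not locked and could itself be swapped out, which would change the picture.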
Since swap was involved, I used smem to analyze which memory segments had ended up in swap. The 2 GB of swap space belong to elasticsearch and MMAP.
This led to a discussion in our team about whether or not this is correct behaviour.
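To break the swap usage down further than smem does, a rough sketch like the following could sum the Swap field per mapping from /proc/<pid>/smaps. It assumes the usual smaps layout (a header line per mapping followed by "Key: value kB" lines). The sample text and its path are invented for illustration. This might help tell anonymous regions apart from file-backed Lucene segment mappings.

```python
import re
from collections import defaultdict

# Mapping header: address range, perms, offset, device, inode, optional pathname.
HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+\S+\s+\S+\s+\d+\s*(.*)$")


def swap_kb_by_mapping(smaps_text):
    """Sum the 'Swap:' values (in kB) per mapping name; unnamed mappings become '[anon]'."""
    totals = defaultdict(int)
    current = None
    for line in smaps_text.splitlines():
        match = HEADER.match(line)
        if match:
            current = match.group(1).strip() or "[anon]"
            continue
        # 'SwapPss:' is deliberately not matched, only the plain 'Swap:' field.
        if current is not None and line.startswith("Swap:"):
            totals[current] += int(line.split()[1])
    return dict(totals)


# Invented sample; on a real node, read /proc/<pid>/smaps of the Elasticsearch process.
SAMPLE_SMAPS = """\
7f0a00000000-7f0a40000000 rw-p 00000000 00:00 0
Size:            1048576 kB
Rss:              900000 kB
Swap:             150000 kB
SwapPss:          150000 kB
7f0b00000000-7f0b10000000 r--s 00000000 08:01 131 /data/example/index/_0.cfs
Size:             262144 kB
Rss:              200000 kB
Swap:                  0 kB
"""

if __name__ == "__main__":
    for name, kb in sorted(swap_kb_by_mapping(SAMPLE_SMAPS).items()):
        print(f"{kb:>10} kB  {name}")
```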
As far as I understand it:
Lucene "indirectly" just gobbles up as much RAM as it can via MMAP. Since we are currently not writing new data (due to the migration), the indexes are not modified, hence they won't be closed and the RAM won't get freed.
It's just a sad coincidence it happens on the Focal Fossa machines.
Is this statement correct? If no RAM is available, Lucene starts utilizing the swap?
(vm.swappiness is 1 and vm.overcommit_memory is 0, since I think that limiting the RAM would lead to exceptions and a dead Elasticsearch process.)
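To rule out a difference in kernel settings between the 18.04 and 20.04 machines, the effective values could be read straight from /proc/sys/vm on each node. This is only a small sketch. The function name read_vm_settings and its base parameter are my own, and a value of None simply means the file could not be read.

```python
from pathlib import Path


def read_vm_settings(names=("swappiness", "overcommit_memory"), base="/proc/sys/vm"):
    """Read integer sysctl values from files under base; unreadable entries map to None."""
    settings = {}
    for name in names:
        try:
            settings[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            settings[name] = None
    return settings


if __name__ == "__main__":
    # Run on each data node and compare the output between the two Ubuntu releases.
    print(read_vm_settings())
```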
The only alternative would be to switch to NIOFS, since 6.1 doesn't have hybridfs (as in 6.8, backported from 7).
NIOFS could be a drop-in replacement, since setting index.store.type to niofs doesn't change the files on disk, only the way the files are accessed, correct?
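For clarity, this is roughly the sequence I have in mind for an existing index, written as a sketch that only builds the requests and sends nothing. It assumes index.store.type, being a static index setting, can be changed while the index is closed. I have not verified that on 6.1, so it would need a try on a test index first. The helper name niofs_switch_plan and the index name are hypothetical.

```python
import json


def niofs_switch_plan(index):
    """Build (method, path, body) tuples for close -> update store type -> open.

    Assumption: the static setting index.store.type may be updated on a closed index.
    """
    body = json.dumps({"index.store.type": "niofs"})
    return [
        ("POST", f"/{index}/_close", None),
        ("PUT", f"/{index}/_settings", body),
        ("POST", f"/{index}/_open", None),
    ]


if __name__ == "__main__":
    # 'my-test-index' is a placeholder, not one of our real indices.
    for method, path, body in niofs_switch_plan("my-test-index"):
        print(method, path, body or "")
```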
Thanks in advance if someone could clear that up.