DocumentSubsetReader is holding on to a large number of objects

On one of our nodes, NUM_DOCS_CACHE in
elasticsearch/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/security/authz/accesscontrol/DocumentSubsetReader.java (elastic/elasticsearch, v8.11.1) is holding on to 67,000 entries, consuming 5 GB of heap. Each entry in the map appears to be another cache that holds only a small number of objects, but because each cache is created with 256 CacheSegments by default, they add up to a lot of memory.
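As a rough back-of-the-envelope check of those numbers, here is a small sketch of the arithmetic. The class and method names are hypothetical, and it assumes "5 GB" means 5 GiB of retained heap spread evenly across the entries, which the heap dump may not bear out exactly.

```java
public class NumDocsCacheEstimate {

    // Total number of inner CacheSegment objects across all map entries.
    static long totalCacheSegments(long mapEntries, int segmentsPerCache) {
        return mapEntries * segmentsPerCache;
    }

    // Average retained bytes per map entry (integer division).
    static long bytesPerEntry(long retainedBytes, long mapEntries) {
        return retainedBytes / mapEntries;
    }

    public static void main(String[] args) {
        long mapEntries = 67_000L;                    // entries seen in the heap dump
        int segmentsPerCache = 256;                   // default segment count per cache
        long retainedBytes = 5L * 1024 * 1024 * 1024; // assumption: "5 GB" read as 5 GiB

        long perEntry = bytesPerEntry(retainedBytes, mapEntries);
        System.out.println("cache segments: " + totalCacheSegments(mapEntries, segmentsPerCache));
        System.out.println("bytes per entry: " + perEntry);
        System.out.println("bytes per cache segment: " + perEntry / segmentsPerCache);
    }
}
```

Under those assumptions this comes to about 17 million CacheSegment objects and roughly 80 KB per map entry, which fits the impression that the per-cache overhead, not the cached values, accounts for most of the memory.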

I deleted my older reply because I misread your message.

It sounds like you have a lot of segments for shards with Document Level Security (DLS) enabled.

If DocumentSubsetReader.NUM_DOCS_CACHE has 67,000 entries, that implies you have 67,000 segments on that node for indices where users have DLS enabled.

Is that right? Do you really have that many segments?

I didn't check when I took the heap dump, but right now each node has around 600 segments. We are having cluster stability issues, with nodes going OOM on a regular basis, and I'm not sure whether this is contributing to the problem.

We do use DLS with templated queries. But 67,000 segments doesn't sound right.

I've never seen this before, and the NUM_DOCS_CACHE should remove entries when segment readers are closed.
If you truly have 67,000 entries in that map, but don't have 67,000 segments, then something is wrong.
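To illustrate the lifecycle I would expect, here is a minimal sketch of a per-reader cache that evicts its entry when the reader closes. It is not the actual Elasticsearch code: FakeSegmentReader, numDocsByReader, and cacheFor are hypothetical stand-ins, and the inner cache is a plain map rather than a segmented cache.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CloseListenerCacheSketch {

    // Hypothetical stand-in for a segment reader; not a Lucene class.
    static final class FakeSegmentReader implements AutoCloseable {
        private final Object cacheKey = new Object();
        private final List<Runnable> closedListeners = new ArrayList<>();

        Object cacheKey() { return cacheKey; }

        void addClosedListener(Runnable listener) { closedListeners.add(listener); }

        @Override
        public void close() {
            // Notify every listener so dependent caches can drop their entries.
            closedListeners.forEach(Runnable::run);
        }
    }

    // Outer map: one inner cache (query -> doc count) per open reader.
    static final Map<Object, Map<String, Integer>> numDocsByReader = new ConcurrentHashMap<>();

    static Map<String, Integer> cacheFor(FakeSegmentReader reader) {
        return numDocsByReader.computeIfAbsent(reader.cacheKey(), key -> {
            // Register eviction once, when the inner cache is first created.
            reader.addClosedListener(() -> numDocsByReader.remove(key));
            return new ConcurrentHashMap<>();
        });
    }

    public static void main(String[] args) {
        FakeSegmentReader first = new FakeSegmentReader();
        FakeSegmentReader second = new FakeSegmentReader();
        cacheFor(first).put("role-query", 10);
        cacheFor(second).put("role-query", 20);
        System.out.println("entries while open: " + numDocsByReader.size());

        first.close();
        second.close();
        System.out.println("entries after close: " + numDocsByReader.size());
    }
}
```

With this pattern the outer map should never hold more entries than there are open readers, so a count far above the live segment count would suggest readers that were never closed or listeners that never fired.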

Can you open a support case, and include any heap dumps you have?

You can include a link to this thread in your support case so whoever picks it up can reach out to me if they need to.