The 50/50 split between "max heap" and "all other memory" is a rule of thumb. It gives the JVM enough heap that ES's Java objects do not pile up under GC pressure, while leaving enough RAM that the ES process and other operating system resources do not compete for it.
So there is no strict partitioning with JVM heap on one side and file system cache on the other; you always need both.
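As a starting point, the rule of thumb can be written down as a small calculation. This is only a sketch: the function names are made up for illustration, and the 31 GiB cap is an assumption on my side (a commonly cited ceiling that keeps compressed object pointers enabled; the exact threshold depends on the JVM). `-Xms` and `-Xmx` are the standard JVM flags for initial and maximum heap.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def suggest_heap_bytes(total_ram_bytes, heap_fraction=0.5, heap_cap_bytes=31 * GIB):
    """Return a starting max-heap size: a fraction of RAM, limited by a cap.

    heap_fraction=0.5 is the 50/50 rule of thumb.
    heap_cap_bytes is an assumed ceiling, not a value taken from the answer above.
    """
    if total_ram_bytes <= 0:
        raise ValueError("total_ram_bytes must be positive")
    return min(int(total_ram_bytes * heap_fraction), heap_cap_bytes)


def as_jvm_flags(heap_bytes):
    """Format the heap size as JVM flags with equal initial and maximum heap."""
    mib = heap_bytes // MIB
    return f"-Xms{mib}m -Xmx{mib}m"


if __name__ == "__main__":
    for ram_gib in (16, 64, 128):
        heap = suggest_heap_bytes(ram_gib * GIB)
        print(f"{ram_gib} GiB RAM: {as_jvm_flags(heap)}")
```

Whatever the heap does not take stays available for off-heap buffers and the file system cache, which is the point of the split. Treat the result as a first guess to test, not as a fixed value.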
Elasticsearch/Lucene runs on the Java Runtime Environment, which allocates the JVM heap. The more work a node has to do, the more JVM heap you need to configure.
Some internal Java byte buffers for reading and writing indices are stored off-heap, but they still count toward the process size. These structures can grow and shrink at runtime, depending mostly on the indexing workload.
All file reads and writes, whether they come from the Java Runtime or not, go through the file system cache. Many files are memory-mapped into the process's virtual memory when they are read. This can speed up file seeks, especially during search operations. If little RAM is available, fewer reads can be served from the cache and some file reads will be slower, but ES still works correctly. As long as your workload is not well known, the file read pattern is unpredictable, so it is assumed that all of the cluster's resources are in use.
If you want to find your own balance, set up a test system, run your indexing/search workload, and watch JVM heap behavior and RAM consumption.
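One way to watch both numbers during such a test run is to poll the nodes stats API and reduce each node to two percentages. A minimal sketch, assuming a node reachable at `http://localhost:9200` without authentication; the function names are hypothetical, and the JSON field names are the ones I know from the `_nodes/stats` response, so check them against your ES version.

```python
import json
from urllib.request import urlopen


def fetch_node_stats(base_url="http://localhost:9200"):
    """Fetch JVM and OS stats for all nodes (base_url is an assumed default)."""
    with urlopen(f"{base_url}/_nodes/stats/jvm,os") as resp:
        return json.load(resp)


def summarize(stats):
    """Reduce a nodes stats response to heap usage and free RAM per node."""
    rows = []
    for node_id, node in stats.get("nodes", {}).items():
        heap = node["jvm"]["mem"]
        ram = node["os"]["mem"]
        rows.append({
            "node": node.get("name", node_id),
            "heap_used_pct": round(
                100 * heap["heap_used_in_bytes"] / heap["heap_max_in_bytes"], 1),
            "os_mem_free_pct": round(
                100 * ram["free_in_bytes"] / ram["total_in_bytes"], 1),
        })
    return rows


if __name__ == "__main__":
    for row in summarize(fetch_node_stats()):
        print(row)
```

Run it repeatedly while the workload is active. Heap usage that stays high after garbage collections suggests the heap is too small, while the OS figure is only a coarse signal, because memory used for the file system cache may be reported as used rather than free.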