Welcome!
Yes — your storage design is a very plausible root cause.
Short answer:
- Separate folders are not enough if they all sit on the same NAS/shared storage.
- Elasticsearch can use remote storage only if that storage behaves exactly like a local disk from the filesystem’s point of view.
- In practice, direct-attached/local storage is generally preferred and is usually more stable and faster.
- If your NAS introduces latency, locking quirks, transient I/O stalls, or filesystem semantics that differ from local disk, it can absolutely lead to:
- shard instability
- slow recovery
- indexing failures
- transport exceptions
- lock-related problems
- poor dashboard/query performance
> Elasticsearch requires the filesystem to act as if it were backed by a local disk … it will work correctly on properly-configured remote block devices (e.g. a SAN) and remote filesystems (e.g. NFS) as long as the remote storage behaves no differently from local storage.
This is the most important point for your case. A NAS / mapped network drive / shared filesystem is not automatically supported just because it is mounted and writable. It must preserve the same semantics and reliability Elasticsearch expects from local disk.
The documentation also notes that directly-attached (local) storage generally performs better than remote storage because it is simpler to configure well and avoids communications overheads. Some remote storage performs very poorly, especially under the kind of load that Elasticsearch imposes.
That aligns strongly with your observation that the node you moved to local disk became more stable.
Answers to your specific questions
Is using NAS/shared storage (even with separate folders per node) supported for Elasticsearch data paths?
It is not “NAS is supported” in a blanket sense. The official position is closer to:
- Remote storage may work
- Only if it behaves exactly like local storage
- You must benchmark and validate it under realistic Elasticsearch load
So if your NAS is exposed as a Windows mapped drive / shared filesystem and has any issues with latency, file locking, caching, metadata operations, or transient disconnects, that can make it unsuitable.
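As a first, crude validation step before any full benchmark, you can measure write+fsync latency directly on the mount. The sketch below is illustrative, not an official Elastic tool; `fsync_latency_ms` is a name I made up, and you would point it at your actual NAS mount rather than the current directory. Elasticsearch fsyncs its translog on the indexing path, so consistently slow fsyncs here translate directly into slow or failing bulk indexing.

```python
# Rough probe of write+fsync latency on a candidate data path.
# Point data_path at the NAS mount you want to evaluate.
# This is a sanity check, not a substitute for benchmarking
# under realistic Elasticsearch load.
import os
import tempfile
import time


def fsync_latency_ms(data_path: str, iterations: int = 50) -> float:
    """Return average write+fsync latency in milliseconds for 4 KiB writes."""
    block = b"x" * 4096
    fd, tmp = tempfile.mkstemp(dir=data_path)
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            os.write(fd, block)
            os.fsync(fd)  # force the write to stable storage, like a translog fsync
        elapsed = time.perf_counter() - start
        return elapsed / iterations * 1000.0
    finally:
        os.close(fd)
        os.unlink(tmp)


if __name__ == "__main__":
    # Replace "." with your NAS-backed data path when testing the mount.
    print(f"avg write+fsync latency: {fsync_latency_ms('.'):.3f} ms")
```

Compare the number you get on the NAS mount against a local disk on the same node; an order-of-magnitude gap, or sporadic multi-second spikes, is a strong hint the storage is unsuitable.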
Can shared storage cause shard instability, indexing failures, or errors like AlreadyClosedException?
Yes, it can contribute to or trigger these symptoms.
While AlreadyClosedException is not a message that uniquely proves “NAS is the cause,” unstable or slow storage can absolutely lead to cascading shard problems such as:
- shard closures/reopens
- failed recoveries
- lock acquisition problems
- shard unavailability
- delayed writes / fsync issues
- node instability under load
Those can then surface as higher-level errors like:
- UnavailableShardsException
- transport exceptions
- bulk indexing failures
- lock-related failures
- slow Grafana queries/dashboard loads
Your note that one node became more stable on local disk is a strong practical indicator.
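The lock-related failures are worth probing separately, because Elasticsearch (via Lucene) takes a `node.lock` file in each data path, and some network filesystems implement advisory locking incorrectly or inconsistently. The following is a minimal POSIX-only sketch (using `fcntl.flock`) of my own devising, not an Elastic utility; run it on the node itself with the path pointed at the NAS mount. It checks that an exclusive lock is actually exclusive: a second non-blocking attempt through a different open file description should be refused.

```python
# Probe whether advisory file locking behaves sanely on a mount.
# On storage with broken locking, the second lock may be silently
# granted, which is exactly the kind of semantic difference from
# local disk that can break Elasticsearch's node.lock handling.
import fcntl
import os


def lock_probe(mount_path: str) -> bool:
    """Return True if a held exclusive flock correctly blocks a
    second non-blocking lock attempt on the same file."""
    path = os.path.join(mount_path, "node.lock.probe")
    with open(path, "w") as first, open(path, "w") as second:
        fcntl.flock(first, fcntl.LOCK_EX)  # hold an exclusive lock
        try:
            # Second open file description: should be refused while held.
            fcntl.flock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            refused = True   # expected behavior on sane storage
        else:
            refused = False  # lock granted twice: a red flag
        fcntl.flock(first, fcntl.LOCK_UN)
    os.unlink(path)
    return refused
```

A `False` result on the NAS mount (where a local disk returns `True`) would be concrete evidence that the storage does not preserve local-disk locking semantics.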
Does Elasticsearch require completely isolated local storage per node for stable operation?
Best practice: yes, isolated local storage per node is strongly preferred.
More precisely:
- Each node must have its own data path
- The storage behind that path should ideally be dedicated local/direct-attached storage
- Remote/shared storage is only acceptable if it is proven to behave like local disk
So Elasticsearch does not strictly require “local disk only” in all cases, but for self-managed clusters, local isolated storage is the safest and most common recommendation.
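Concretely, that recommendation looks like the following per-node configuration; the node name and paths are illustrative placeholders, not your actual layout:

```yaml
# elasticsearch.yml on one node -- paths are examples only.
# Each node gets its own data path on a dedicated local or
# direct-attached volume, not a folder on a shared NAS mount.
node.name: node-1
path.data: /var/lib/elasticsearch/data   # local/direct-attached disk
path.logs: /var/log/elasticsearch
```

Each node in the cluster repeats this pattern with its own local volume, so no two nodes share an underlying storage backend.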
Could our current setup (multiple nodes using different folders on the same NAS) be the reason for these issues?
Yes — very possibly.
Even though each node uses a different folder, all nodes still depend on the same shared NAS backend. That creates several risks:
- shared I/O bottleneck across all nodes
- latency spikes affecting multiple nodes at once
- locking/metadata behavior that differs from local disk
- correlated failures during heavy indexing or recovery
- poor shard recovery performance
- cluster instability when the storage stalls
So the fact that the folders are different does not eliminate the architectural risk, because the underlying storage system is still shared.
Can I get official documentation for this?
Here are the most relevant official Elastic docs for your case:
HTH