Shared file system repository and Roles

Hi

I have a cold node with the appropriate roles, but it does not include the data role.

My goal is to set up a Shared File System repository. I have a dedicated disk attached to the cold node, but I cannot create the repository from Kibana. From what I’ve read, the data role is required for this.

My concern is that if I assign the data role to this node, agents might start writing directly to it, which I want to avoid. My goal is for data to be written first to the hot tier and then moved to the cold tier after a certain number of days.

What would you recommend in this case?

cold roles from yml: node.roles: [master, voting_only, data_cold]

You don’t need the generic data role; data_cold already counts as a data role.
Make sure to:
• Set index templates with _tier_preference: data_hot,… so new writes go to hot.
• Use ILM to move data to cold later (a minimal sketch of both is below the list).
• Keep the cold node out of your ingest/load balancer so agents never write to it directly.
• For an fs repo, the snapshot path must be mounted and allowed on all master/data nodes. If that’s not possible, use an object store repo (S3, GCS, Azure).
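
If it helps, here is a minimal sketch of the first two points, assuming a hypothetical logs-* data stream and placeholder ages; adjust the names and timings to your setup:

// names and ages below are placeholders
PUT _ilm/policy/logs-policy
{
  "policy": {
    "phases": {
      "hot": { "actions": { "rollover": { "max_age": "7d" } } },
      "cold": { "min_age": "30d", "actions": {} }
    }
  }
}

PUT _index_template/logs-template
{
  "index_patterns": ["logs-*"],
  "data_stream": {},
  "template": {
    "settings": {
      "index.routing.allocation.include._tier_preference": "data_hot",
      "index.lifecycle.name": "logs-policy"
    }
  }
}

An empty cold phase is enough here: ILM’s implicit migrate action moves the shards to data_cold nodes once min_age is reached.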

Thank you for your attention, but that is not the solution.

My problem is specifically that I need to configure the separate disk attached to the cold node as a repository. All of my lifecycle policies and related configurations are already functioning correctly.

The documentation states:

“First mount the file system to the same location on all master and data nodes. Then add the file system’s path or parent directory to the path.repo setting in elasticsearch.yml for each master and data node. For running clusters, this requires a rolling restart of each node.”

In my setup, the cold node has the roles master and data_cold. However, I believe it also requires the separate data role, which, as I described earlier, would introduce the issue I mentioned.

It does not require the data role; the documentation does not mention that. It says that every master and data node needs access to the shared path, and a node with the data_cold role is a data node.

Also, if you want to use a file system repository, it needs to be a shared file system: all master and data nodes in the cluster need read and write access to it. You can use the disk attached to the cold node, I think, but you need to export it as a shared file system to all other nodes and use the shared path to configure the repository in elasticsearch.yml.
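
For example, something like this (the hostname, network range, and mount point are just placeholders, and the Elasticsearch user needs read/write permission on the share):

# On the cold node, assuming the disk is mounted at /mnt/es_snapshots
# /etc/exports
/mnt/es_snapshots  10.0.0.0/24(rw,sync,no_subtree_check)

# Reload the export table on the cold node
exportfs -ra

# On every other master and data node, mount it at the same path
mount -t nfs cold-node:/mnt/es_snapshots /mnt/es_snapshots

# elasticsearch.yml on every master and data node
path.repo: ["/mnt/es_snapshots"]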

Have you done it already?


Got it, the blocker isn’t ILM or roles; it’s the fs repository requirement.

data_cold already is a data role. You do not need to add the generic data role.

However, an fs snapshot repo cannot live on a disk seen by only one node. By design, the repo path must be mounted at the same location on every master and data node and whitelisted via path.repo on each. This is because data nodes stream shard files directly to the repo.

What will work

Share that disk via NFS/SMB (or another shared FS), mount it on all master+data nodes at the same path (e.g., /mnt/es_snapshots), set:

path.repo: ["/mnt/es_snapshots"]

on every node, do a rolling restart, then:

PUT _snapshot/my_fs_repo
{
  "type": "fs",
  "settings": { "location": "cluster-a", "compress": true }
}
POST _snapshot/my_fs_repo/_verify

Your cold node can still stay master, data_cold and won’t receive new writes as long as your templates use _tier_preference: data_hot,… and the cold node isn’t behind your ingest/LB.
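
If you want to double-check, something like this should show that the shards of your write indices are still on hot nodes and that the tier preference is what you expect (the logs-* pattern is a placeholder):

GET _cat/shards/logs-*?v&h=index,shard,prirep,node

GET logs-*/_settings?filter_path=*.settings.index.routing.allocation.include._tier_preference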


Thanks very much