Nodes master+data share the same volume and path

Hello everyone,

I have deployed a StatefulSet in Kubernetes that creates three Elasticsearch nodes, all of the master+data type, sharing the same volume. I then came across this warning in the documentation:

Never run different node types (i.e. master, data) from the same data directory. This can lead to unexpected data loss.

https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html#max-local-storage-nodes

Is there a risk if all nodes are of the same type (master and data)? The warning can be read as applying only to nodes of different types, i.e. you could not have, say, 2 master nodes and 3 data nodes on the same data directory. In that case there might be no risk when all nodes are identical (master+data).

Thank you very much in advance.

Kind regards,

Alfonso Ruiz-Bravo

Huh, I've never seen that warning before, and it's not clear why it's there. It is of course perfectly fine to set node.master: true and node.data: true on a node; indeed this is the default configuration.

However, as a general rule you should avoid setting node.max_local_storage_nodes in production, and instead give each node a different data path.
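
For what it's worth, here is a minimal sketch of an elasticsearch.yml that follows this advice. The node name and data path are hypothetical placeholders, and the node.master / node.data lines just spell out the defaults mentioned above.

```yaml
# elasticsearch.yml for one node (sketch; name and path are placeholders)
node.name: es-node-0            # hypothetical name
node.master: true               # default, shown for clarity
node.data: true                 # default, shown for clarity
path.data: /var/data/es-node-0  # hypothetical path, used by this node only
# node.max_local_storage_nodes is deliberately left unset (its default is 1),
# so a second node cannot share this data path.
```

Each further node would get its own copy of this file with a different node.name and a different path.data.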

Ok, I see. If you have multiple nodes running in the same data path then they each choose a different subfolder in which to keep their data, and there's no guarantee that a node will use the same subfolder after a restart. If a master-only node starts up on a subfolder that previously belonged to a data-only node then it will ignore the index data there. After another restart that subfolder might be picked up by a data node again, which could then re-import some stale index data.

Avoid node.max_local_storage_nodes and you should be ok.
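
Since the setup in question is a Kubernetes StatefulSet, one way to give each node its own data path is a volumeClaimTemplates entry, which makes Kubernetes create a separate PersistentVolumeClaim per pod instead of one shared volume. This is only a sketch: the names, image tag, and storage size are placeholders I've assumed, and the mount path is the default data directory of the official Docker image.

```yaml
# StatefulSet sketch: one PersistentVolumeClaim per pod (names are placeholders)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster                 # hypothetical name
spec:
  serviceName: es-cluster          # assumes a headless Service with this name exists
  replicas: 3
  selector:
    matchLabels:
      app: es-cluster
  template:
    metadata:
      labels:
        app: es-cluster
    spec:
      containers:
        - name: elasticsearch
          # replace <version> with the Elasticsearch version you actually run
          image: docker.elastic.co/elasticsearch/elasticsearch:<version>
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:            # Kubernetes creates one claim per pod from this
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi          # placeholder size
```

With a template like this, pod es-cluster-0 gets the claim data-es-cluster-0, pod es-cluster-1 gets data-es-cluster-1, and so on, and each pod is reattached to its own claim after a restart.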

Thank you so much for your quick response.

I understand the risk, but would that risk still be present if all the nodes are of the same type, that is, if all the nodes are master and data?

Kind regards,

Alfonso Ruiz-Bravo

No, if they all have the same roles (i.e. node.master: true and node.data: true) then the situation I described won't be a problem.

Thank you again for your quick response.

In the case that the nodes have the same roles (i.e. node.master: true and node.data: true), if the nodes were restarted, would there still be a risk that a node ends up in a directory that previously belonged to a different node?

I ask because of what you mentioned earlier:

no guarantee that they will use the same subfolder after a restart

I understand that if nodes have the same roles, two things can happen when nodes restart:

A) A node may end up in a subfolder other than the one it was using, but since all nodes have the same roles this is not a risk.

B) Each node uses the subfolder it was previously using.

Kind regards,

Alfonso Ruiz-Bravo

Yes, but this shouldn't matter. It means that the correspondence between node.name (in the elasticsearch.yml file) and the node ID (stored in the data path) is messed up, but Elasticsearch is supposed to cope with this.

Thank you very much for your answers, you have helped me a lot and in a short time.

I will take into account everything you have said and advised me.

Very good job.

Kind regards,

Alfonso Ruiz-Bravo
