I am trying to take a backup of ES indices in my two-node cluster. I am getting an error like: the store [/user/home/esbackup] is not shared between this node [data node] and the master node, or the permissions on the store don't allow reading files written by the master node. That part I understand.
Now, coming to my query, the ES documentation says:
Every data and master-eligible node requires access to a data directory where shards and index and cluster metadata will be stored.
Now, if I change the data path in the elasticsearch.yml file (path.data), won't I be required to put the new data path on a shared filesystem such as NFS, S3, or HDFS? Because in the case of the backup repo (path.repo), I need the backup location to be on a shared filesystem. How is the data directory different from the backup repository?
They are two completely different concepts and have no dependency, apart from the fact that they have to be different paths. A node's data is stored under path.data, while any snapshots are stored somewhere under the paths specified by path.repo. The first should be on local storage, while the second needs to use shared storage.
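To make the distinction concrete, here is a minimal sketch of how the two settings might look in elasticsearch.yml on each node. The mount point /mnt/es_backups is an assumption — any shared-filesystem path visible to every node at the same location would do:

```yaml
# elasticsearch.yml (on every data and master-eligible node)

# path.data: local disk, private to this node
path.data: /var/data/elasticsearch

# path.repo: whitelist of paths that snapshot repositories may use;
# must point at shared storage, e.g. an NFS mount (assumed path)
path.repo: ["/mnt/es_backups"]
```

Each node writes its own shards under its local path.data; only the snapshot location needs to be the same shared directory everywhere.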
If I use a multi-node cluster, then the data stored in path.data is not specific to that node, right? It will be available to the whole cluster, and every node can read data from that location?
If the data in path.data is consistent throughout the cluster, then can I use something like this?

path.data: /var/data/elasticsearch
path.repo: /var/data/elasticsearch/backups
No, the node in question manages the data stored there; it is not directly accessible from other nodes. Requests are, however, served by the cluster as a whole.
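As a quick illustration of that last point: a search sent to any node is routed internally to whichever nodes hold the relevant shards, so the client never reads another node's path.data directly (the index name here is hypothetical):

```
GET /my_index/_search
{
  "query": { "match_all": {} }
}
```

The response aggregates results from all shards, regardless of which node received the request.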
I would recommend using two completely different paths.
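Following that recommendation, once path.repo points at a shared path on every node, the snapshot repository is registered once through the REST API (console-style request; the repository name my_backup and the location /mnt/es_backups are assumptions for the sketch):

```
PUT /_snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "location": "/mnt/es_backups"
  }
}
```

The location must be inside (or equal to) one of the paths listed in path.repo on every node, otherwise registration fails with the "not shared between this node and the master node" error from the first post.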