Elasticsearch - Snapshot having unassigned shards and Repository verification exception


(Nikesh) #1

Hi,
I am working on backing up the indices in my cluster. I have one dedicated master node and two data nodes, and the configuration file on every node contains " path.repo: ["/u01/backup"] ".
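If it helps, I believe the setting each running node actually picked up can also be double-checked with the nodes info API, which should list path.repo under every node's settings:

GET - http://10.50.1.102:9999/_nodes/settings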
First, I created a repository using the API:
PUT - http://10.50.1.102:9999/_snapshot/firstbackup

{
  "indices": "index1",
  "type": "fs",
  "settings": {
    "location": "backup",
    "compress": true
  }
}

The response to this was:

  {"error": {"root_cause": [ {"type": "repository_verification_exception",
                    "reason": "[firstbackup] [[fjUpFsbyRrSeN4I18Tmewg, 'RemoteTransportException[[search_slave2][10.50.1.100:9300][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[firstbackup] a file written by master to the store [/u01/backup/backup] cannot be accessed on the node [{search_slave2}{fjUpFsbyRrSeN4I18Tmewg}{wQHn0uN6QdOppX2yE5GQlQ}{10.50.1.100}{10.50.1.100:9300}{ml.machine_memory=16656232448, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]. This might indicate that the store [/u01/backup/backup] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];'], [vmNm9LzsSpGo_-mYZtPe5w, 'RemoteTransportException[[search_slave1][10.50.1.101:9300][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[firstbackup] a file written by master to the store [/u01/backup/backup] cannot be accessed on the node [{search_slave1}{vmNm9LzsSpGo_-mYZtPe5w}{loRHnkpNTBGJPb4Xsx_vrQ}{10.50.1.101}{10.50.1.101:9300}{ml.machine_memory=16656236544, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]. This might indicate that the store [/u01/backup/backup] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];']]"}   ],
            "type": "repository_verification_exception",
            "reason": "[firstbackup] [[fjUpFsbyRrSeN4I18Tmewg, 'RemoteTransportException[[search_slave2][10.50.1.100:9300][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[firstbackup] a file written by master to the store [/u01/backup/backup] cannot be accessed on the node [{search_slave2}{fjUpFsbyRrSeN4I18Tmewg}{wQHn0uN6QdOppX2yE5GQlQ}{10.50.1.100}{10.50.1.100:9300}{ml.machine_memory=16656232448, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]. This might indicate that the store [/u01/backup/backup] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];'], [vmNm9LzsSpGo_-mYZtPe5w, 'RemoteTransportException[[search_slave1][10.50.1.101:9300][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[firstbackup] a file written by master to the store [/u01/backup/backup] cannot be accessed on the node [{search_slave1}{vmNm9LzsSpGo_-mYZtPe5w}{loRHnkpNTBGJPb4Xsx_vrQ}{10.50.1.101}{10.50.1.101:9300}{ml.machine_memory=16656236544, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]. This might indicate that the store [/u01/backup/backup] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];']]"  },    "status": 500 }

Although it threw an exception here, I could see a new file generated at the specified location.
Next, I created a snapshot using the API: PUT - http://10.50.1.102:9999/_snapshot/firstbackup/snapshot_1?wait_for_completion=true

{ "indices": "index1",
  "ignore_unavailable": true,
  "include_global_state": false  }

The response for this is:

  {  "snapshot": {     "snapshot": "snapshot_1",  "uuid": "B60-2eQbRPC42B3IczSrfA", on_id": 6040099, "version": "6.4.0",  "indices": [  "index1"  ],  "include_global_state": false,  "state": "SUCCESS",
            "start_time": "2019-01-10T13:38:47.963Z",
            "start_time_in_millis": 1547127527963,
            "end_time": "2019-01-10T13:38:48.007Z",
            "end_time_in_millis": 1547127528007,
            "duration_in_millis": 44,
            "failures": [],   "shards": {  "total": 5,  "failed": 0,"successful": 5 }} }

Now, I deleted the index from the cluster.
Next, I tried to restore the deleted index using the API: POST - http://10.50.1.102:9999/_snapshot/firstbackup/snapshot_1/_restore, for which I received the response:

{   "accepted": true }

Now my cluster health shows red status, as both the primary and replica shards of this index are unassigned.
When I run the API http://10.50.1.102:9999/_cluster/allocation/explain?pretty, I get:

{ "index": "index1",
    "shard": 4,
    "primary": false,
    "current_state": "unassigned",
    "unassigned_info": {
        "reason": "NEW_INDEX_RESTORED",
        "at": "2019-01-10T13:12:06.536Z",
        "details": "restore_source[firstbackup/snapshot_1]",
        "last_allocation_status": "no_attempt"   },
    "can_allocate": "no",
    "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
    "node_allocation_decisions": [   {
            "node_id": "fjUpFsbyRrSeN4I18Tmewg",
            "node_name": "search_slave2",
            "transport_address": "10.50.1.120:9300",
            "node_attributes": {
                "ml.machine_memory": "16656232448",
                "ml.max_open_jobs": "20",
                "xpack.installed": "true",
                "ml.enabled": "true"     },
            "node_decision": "no",
            "deciders": [{
                    "decider": "replica_after_primary_active",
                    "decision": "NO",
                    "explanation": "primary shard for this replica is not yet active",
                { "decider": "throttling",
                    "decision": "NO",
                    "explanation": "primary shard for this replica is not yet active" }  },
        { "node_id": "vmNm9LzsSpGo_-mYZtPe5w",
            "node_name": "search_slave1",
            "transport_address": "10.50.1.121:9300",
            "node_attributes": {
                "ml.machine_memory": "16656236544",
                "ml.max_open_jobs": "20",
                "xpack.installed": "true",
                "ml.enabled": "true" },
            "node_decision": "no",
            "deciders": [ {
                    "decider": "replica_after_primary_active",
                    "decision": "NO",
                    "explanation": "primary shard for this replica is not yet active"    },
                { "decider": "throttling",
                    "decision": "NO",
                    "explanation": "primary shard for this replica is not yet active" }  ] }  ] }

Am I missing something here? Is there a configuration that I have missed?


(Christian Dahlqvist) #2

Does the repository path point to a shared filesystem that is accessible by all nodes? Note that it cannot point to a path on each node's local filesystem.
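For example, /u01/backup would typically be the same NFS (or similar) export mounted at the same path on the master node and both data nodes. Once that is in place, you can re-check that every node can read and write the repository with the verify API (a sketch reusing the host and repository name from the post above):

POST - http://10.50.1.102:9999/_snapshot/firstbackup/_verify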


(system) closed #3

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.