Restoring to a different cluster with version 5.2.0


(Jonathan Spooner) #1

In the guides for restoring to a different cluster it says

If indices in the original cluster were assigned to particular nodes using shard allocation filtering, the same rules will be enforced in the new cluster. Therefore if the new cluster doesn’t contain nodes with appropriate attributes that a restored index can be allocated on, such index will not be successfully restored unless these index allocation settings are changed during restore operation.

In my situation I have an index that was created with the _shrink API. Its source index required routing to a specific node so the shrink could happen. After the new index was created, its shards automatically moved off of the node used for the shrink, so I believe my new index does not contain any custom routing settings.
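For context, a typical shrink workflow looks roughly like this (a sketch; the node name "shrink-node" is illustrative, the index names are the ones from the metadata below):

```shell
# Force every shard of the source index onto a single node and block
# writes -- both are preconditions for _shrink.
curl -XPUT "$INSTANCE_IP:9200/bulk-sightings-geohex-2016-01-04/_settings" -d '{
  "index.routing.allocation.require._name": "shrink-node",
  "index.blocks.write": true
}'

# Shrink the source index into a new single-shard target index.
curl -XPOST "$INSTANCE_IP:9200/bulk-sightings-geohex-2016-01-04/_shrink/sightings-geohex-2016-01-04" -d '{
  "settings": {
    "index.number_of_shards": 1
  }
}'
```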

After deleting the original index I took a _snapshot. The next day I started up a new Elasticsearch cluster and attempted a _restore. Indices that are not products of the _shrink API restore immediately; the others sit in a red state.
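The snapshot and restore calls looked roughly like this (a sketch; the repository and snapshot names are the ones that appear in the allocation output below, and the repository is assumed to be registered already):

```shell
# Take the snapshot into the registered S3 repository.
curl -XPUT "$INSTANCE_IP:9200/_snapshot/s3_repository/2017-02-10-17:36:49?wait_for_completion=true"

# On the new cluster: restore everything from that snapshot.
curl -XPOST "$INSTANCE_IP:9200/_snapshot/s3_repository/2017-02-10-17:36:49/_restore"
```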

When looking at the index metadata I see there is an allocation.initial_recovery._id assigned. This is not something you can set in the config, and it is not the node.name.
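One way to confirm that this _id doesn't match anything in the new cluster is to list the full node IDs with the _cat API:

```shell
# List the full (untruncated) node id, name and IP of every node in
# the cluster. The HUbsbDLGRrWwoQtKlXP3Vw id from the index metadata
# does not appear here, because that node belonged to the old cluster.
curl -XGET "$INSTANCE_IP:9200/_cat/nodes?v&h=id,name,ip&full_id=true"
```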

{
	"state": "open",
	"settings": {
		"index": {
			"routing": {
				"allocation": {
					"initial_recovery": {
						"_id": "HUbsbDLGRrWwoQtKlXP3Vw"
					}
				}
			},
			"allocation": {
				"max_retries": "1"
			},
			"number_of_shards": "1",
			"shrink": {
				"source": {
					"name": "bulk-sightings-geohex-2016-01-04",
					"uuid": "Y3tQt30zRTKUiQvG39VtyA"
				}
			},
			"provided_name": "sightings-geohex-2016-01-04",
			"creation_date": "1486754454536",
			"number_of_replicas": "1",
			"uuid": "UwlPzI9sQQyRVKUijXF5nQ",
			"version": {
				"created": "5020099"
			}
		}
	},
	"in_sync_allocations": {
		"0": [
			"1HL22qjlTxyi1IllOqACVQ",
			"EZ__d7liTaaTeO2Z6BlJUA"
		]
	}
}

Using the _cluster/allocation/explain API we can look at the primary shard.

curl -XGET "$INSTANCE_IP:9200/_cluster/allocation/explain?pretty" -d '{
  "index": "sightings-geohex-2016-01-01",
  "shard": 0,
  "primary": true
}'

Again we get confirmation of our _id problem:

initial allocation of the index is only allowed on nodes [_id:"HUbsbDLGRrWwoQtKlXP3Vw"]

{
    "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
    "can_allocate": "no",
    "current_state": "unassigned",
    "index": "sightings-geohex-2016-01-01",
    "node_allocation_decisions": [
        {
            "deciders": [
                {
                    "decider": "filter",
                    "decision": "NO",
                    "explanation": "initial allocation of the index is only allowed on nodes [_id:\"HUbsbDLGRrWwoQtKlXP3Vw\"]"
                }
            ],
            "node_decision": "no",
            "node_id": "2LS9wxZuS72Wt_lbUybEIw",
            "node_name": "2LS9wxZ",
            "transport_address": "10.91.139.169:9300",
            "weight_ranking": 1
        },
        {
            "deciders": [
                {
                    "decider": "filter",
                    "decision": "NO",
                    "explanation": "initial allocation of the index is only allowed on nodes [_id:\"HUbsbDLGRrWwoQtKlXP3Vw\"]"
                }
            ],
            "node_decision": "no",
            "node_id": "ZRduycOURwyyGn1SZrd72Q",
            "node_name": "ZRduycO",
            "transport_address": "10.61.190.175:9300",
            "weight_ranking": 2
        }
    ],
    "primary": true,
    "shard": 0,
    "unassigned_info": {
        "at": "2017-02-15T18:21:43.960Z",
        "details": "restore_source[s3_repository/2017-02-10-17:36:49]",
        "last_allocation_status": "no",
        "reason": "NEW_INDEX_RESTORED"
    }
}

OK, let's review a few claims from the original statement:

If indices in the original cluster were assigned to particular nodes using shard allocation filtering, the same rules will be enforced in the new cluster.

The index created from the _shrink API does not have any routing assigned to it.

index will not be successfully restored unless these index allocation settings are changed during restore operation.

OK, so our index doesn't have a custom _name; it has a mysterious _id. Let's try to ignore that _id by adding it to ignore_index_settings, and set a node name to something in our cluster. And let's change number_of_replicas so we know these settings are being applied.

curl -s -XPOST "$INSTANCE_IP:9200/_snapshot/s3_repository/2017-02-10-17:36:49/_restore?wait_for_completion=true" -d '{
  "indices": "sightings-geohex-2016-01-01",
  "ignore_unavailable": true,
  "include_global_state": false,
  "index_settings": {
    "number_of_replicas":2,
    "index.routing.allocation.require._name": "1wXza4A"
  },
  "ignore_index_settings": [
    "index.routing.allocation.require._id"
  ]
}'

And to my surprise, we get the same result: a red cluster.

So does anyone know what this mystery _id is?

And since _restore?wait_for_completion=true will never return, should it exit with an error code instead of hanging forever?


(system) #2

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.