Issue using searchable snapshots in an ILM policy

3266miles · January 23, 2021, 6:47pm

Taking searchable snapshots as part of an ILM policy for a spin, the indexes seem to be moving through the stages fine until it reaches the cold stage, where it takes the snapshot but then utterly fails to mount it.

Running a hot-warm-cold setup on Elastic 7.10.1 on ECK 1.3.1 trial license applied and using GCS as the snapshot backend.

The stack that I'm seeing (for every index that's attempted so far) is:

failing shard [failed shard, shard [restored-shrink-auditbeat-v7-000001][0], node[dQ8FF6kjTN6GXKAg9-36DQ], [P], recovery_source[snapshot recovery [_no_api_] from default:2021.01.23-shrink-auditbeat-v7-000001-pantheon-audit-0goifa_vqsynluhbnlktww/GVVNoA-SSUqyclwjIUMZLg], s[INITIALIZING], a[id=xluTACB4TfWtHUOO7O4lhQ], unassigned_info[[reason=ALLOCATION_FAILED], at[2021-01-23T17:50:26.368Z], failed_attempts[2], failed_nodes[[dQ8FF6kjTN6GXKAg9-36DQ]], delayed=false, details[failed shard on node [dQ8FF6kjTN6GXKAg9-36DQ]: failed to create index, failure IllegalStateException[multiple engine factories provided for [restored-shrink-auditbeat-v7-000001/IYliwBPhQaemofR4hVwMvA]: [org.elasticsearch.snapshots.SourceOnlySnapshotRepository$$Lambda$5291/0x0000000801869d30],[org.elasticsearch.xpack.searchablesnapshots.SearchableSnapshots$$Lambda$5292/0x00000008012a5a30]]], allocation_status[fetching_shard_data]], expected_shard_size[14285895102], message [failed to create index], failure [IllegalStateException[multiple engine factories provided for [restored-shrink-auditbeat-v7-000001/IYliwBPhQaemofR4hVwMvA]: [org.elasticsearch.snapshots.SourceOnlySnapshotRepository$$Lambda$5291/0x0000000801869d30],[org.elasticsearch.xpack.searchablesnapshots.SearchableSnapshots$$Lambda$5292/0x00000008012a5a30]]], markAsStale [true]]

A cat of GET _cluster/allocation/explain?pretty (reduced down)

{
  "index" : "restored-shrink-auditbeat-v7-000004",
  "shard" : 0,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "ALLOCATION_FAILED",
    "at" : "2021-01-23T18:38:48.858Z",
    "failed_allocation_attempts" : 5,
    "details" : "failed shard on node [dQ8FF6kjTN6GXKAg9-36DQ]: failed to create index, failure IllegalStateException[multiple engine factories provided for [restored-shrink-auditbeat-v7-000004/8IaTkA4oQQS9o04-aHoaPA]: [org.elasticsearch.snapshots.SourceOnlySnapshotRepository$$Lambda$5291/0x0000000801869d30],[org.elasticsearch.xpack.searchablesnapshots.SearchableSnapshots$$Lambda$5292/0x00000008012a5a30]]",
    "last_allocation_status" : "no"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "dQ8FF6kjTN6GXKAg9-36DQ",
      "node_name" : "oracle-es-cold-1",
      "transport_address" : "10.50.7.3:9300",
      "node_attributes" : {
        "k8s_node_name" : "gke-oracle-standard-9729ac57-414q",
        "xpack.installed" : "true",
        "transform.node" : "false"
      },
      "node_decision" : "no",
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "NO",
          "explanation" : "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2021-01-23T18:38:48.858Z], failed_attempts[5], failed_nodes[[dQ8FF6kjTN6GXKAg9-36DQ]], delayed=false, details[failed shard on node [dQ8FF6kjTN6GXKAg9-36DQ]: failed to create index, failure IllegalStateException[multiple engine factories provided for [restored-shrink-auditbeat-v7-000004/8IaTkA4oQQS9o04-aHoaPA]: [org.elasticsearch.snapshots.SourceOnlySnapshotRepository$$Lambda$5291/0x0000000801869d30],[org.elasticsearch.xpack.searchablesnapshots.SearchableSnapshots$$Lambda$5292/0x00000008012a5a30]]], allocation_status[deciders_no]]]"
        },
        {
          "decider" : "restore_in_progress",
          "decision" : "NO",
          "explanation" : "shard has failed to be restored from the snapshot [default:2021.01.23-shrink-auditbeat-v7-000004-pantheon-audit-zuekhgcbsrs9ngym0zfjxw/vSp3OhG8TPG5NjFSge3fLw] because of [failed shard on node [dQ8FF6kjTN6GXKAg9-36DQ]: failed to create index, failure IllegalStateException[multiple engine factories provided for [restored-shrink-auditbeat-v7-000004/8IaTkA4oQQS9o04-aHoaPA]: [org.elasticsearch.snapshots.SourceOnlySnapshotRepository$$Lambda$5291/0x0000000801869d30],[org.elasticsearch.xpack.searchablesnapshots.SearchableSnapshots$$Lambda$5292/0x00000008012a5a30]]] - manually close or delete the index [restored-shrink-auditbeat-v7-000004] in order to retry to restore the snapshot again or use the reroute API to force the allocation of an empty primary shard"
        }
      ]
    }
  }

As mentioned in the response, I also attempted to manually reroute the shards onto nodes, but to no avail.

Not overly sure if I'm missing something from the docs about this or if this is a bug (I know this is still quite experimental)

TIA,

-Dan Miles

DavidTurner · January 25, 2021, 5:47pm

Are you using a source-only repository? If so, that's not going to work with searchable snapshots -- the whole point of source-only repositories is that they drop all the data structures needed to support searching.

I opened an issue to improve the error reporting in this case:

github.com/elastic/elasticsearch

Prevent interaction between source-only repositories and searchable snapshots

opened 05:47PM - 25 Jan 21 UTC

closed 01:29PM - 26 Jan 21 UTC

DaveCTurner

>bug :Distributed/Snapshot/Restore Team:Distributed

Source-only repositories and searchable snapshots don't mix, but today we don't …have any special handling to prevent users from trying to mix them. A [user on the forums reported](https://discuss.elastic.co/t/issue-using-searchable-snapshots-in-an-ilm-policy/262001) that this does lead to errors but they're pretty obscure. We should decide whether to support this combination or not. If not then we should reject it in a clearer and more actionable fashion.

3266miles · January 26, 2021, 3:33am

Ah excellent spot. Yeah it should have been apparent to me that they'd never play together for sure.

It's not awfully clear on the docs either that that won't end well. Hopefully this issue should serve for posterity.

Thanks David

system · February 23, 2021, 3:33am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Searchable Snapshot ILM ERROR Elasticsearch ilm-index-lifecycle-management , elastic-stack-searchable-snapshots	1	688	July 14, 2021
Trouble with searchable snapshot Elasticsearch ilm-index-lifecycle-management , elastic-stack-searchable-snapshots	3	517	May 31, 2022
Searchable snapshots - cold phase - not working as expected Elasticsearch ilm-index-lifecycle-management	2	782	August 23, 2021
Snapshot Search vs Cold Nodes Elasticsearch ilm-index-lifecycle-management , slm-snapshot-lifecycle-management	2	489	December 9, 2020
Red indices when restoring from snapshot which contains indices in cold phase Elasticsearch	9	1256	August 30, 2021

Issue using searchable snapshots in an ILM policy

Related topics