Is it safe to delete SLM history indexes

Hi,

I have added a new master-eligible node (say node04) and was decommissioning another master-eligible node (say node03) from the 3-node cluster, because it is on a different Elasticsearch version. So I used cluster allocation filtering to exclude node03 (a sketch of the settings call I used is further down), and I noticed the shards from node03 started relocating to the other nodes (good news). However, the relocation then stopped, and I found that one of the SLM history indices cannot be relocated due to the following error:

   "index" : ".slm-history-3-000001",
  "shard" : 0,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "ALLOCATION_FAILED",
    "at" : "2021-01-22T22:14:38.964Z",
    "failed_allocation_attempts" : 5,
    "details" : "failed shard on node [p0HAemBmS4WztNbw_Oelcw]: failed to create index, failure IllegalArgumentException[unknown setting [index.hidden] please check that any required plugins are installed, or check the breaking changes documentation for removed settings]",
    "last_allocation_status" : "no"

And each of the nodes is showing the following error:

      "node_decision" : "no",
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "NO",
          "explanation" : "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2021-01-22T22:14:38.964Z], failed_attempts[5], failed_nodes[[p0HAemBmS4WztNbw_Oelcw, NTCrJwD7TXeOsdFHErCBvg]], delayed=false, details[failed shard on node [p0HAemBmS4WztNbw_Oelcw]: failed to create index, failure IllegalArgumentException[unknown setting [index.hidden] please check that any required plugins are installed, or check the breaking changes documentation for removed settings]], allocation_status[deciders_no]]]"

I also noticed that this particular SLM index (.slm-history-3-000001) does not have a copy of its shard on any node, so I can't perform allocate_stale_primary or allocate_empty_primary.
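
For example, a stale-primary allocation attempt along these lines (a sketch; node04 as the target node is just an example) gets rejected with the error below:

    POST _cluster/reroute
    {
      "commands": [
        {
          "allocate_stale_primary": {
            "index": ".slm-history-3-000001",
            "shard": 0,
            "node": "node04",
            "accept_data_loss": true
          }
        }
      ]
    }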

"type": "illegal_argument_exception",
    "reason": "No data for shard [0] of index [.slm-history-3-000001] found on any node"

I was thinking of deleting the index. I am OK with the data loss, as newer SLM history indices have already been rolled over. Is it safe to delete this particular SLM index, or are there any suggestions on how to fix this problem?
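
If deleting is acceptable, I assume it would just be a plain delete-index call against the exact index name, something like:

    DELETE /.slm-history-3-000001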

I tried shutting down node03 (in the hope that things would rebalance themselves), but I was getting an "all shards failed" error and it did not seem to be recovering. So I brought node03 back up and everything was running again. It seems I am stuck with node03 in the cluster, which I want to decommission (and it is currently in the exclusion list). Any ideas, guys?

Also, I tried re-running the reroute of the failed shards (the exact call is sketched after the output below), but I was unable to retry them; the response includes:

"description" : "index write (api)",
            "retryable" : false,
            "levels" : [
              "write"

Any ideas?
