Elasticsearch Curator restores the ".security" index during the restore action

I am using Elasticsearch Curator 5.8 to take snapshots in region1 and restore them in region2. The cluster was recently upgraded to 7.14, so I had to upgrade Curator to 5.8 (at least 5.7 is required).

Now, when I restore a snapshot to region2, Curator replaces the ".security" index. As a result, the admin user, whose role is set to "admin", stops working, and the restore fails.

  1. Why is Curator restoring and replacing the ".security" index, causing credential mismatches on region2?
  2. Is there any way to exclude this index from the restore? I tried the rename_pattern option, but it does not help. My guess is that the rename happens after the index on region2 has already been replaced.

@theuntergeek can you please help?

It sounds to me like this is in Elasticsearch rather than Curator. If it worked before, but doesn't now, that suggests Elasticsearch, because Curator hasn't changed.

The easiest way forward is to exclude .security from being snapshotted on region1 (but still store it in a special separate snapshot), then restoring your other data on region2 would never include the .security index.

Thank you @theuntergeek for your response.
Yes, I want to exclude that index somehow, but I have not found a way to exclude it in the snapshot action YAML. Can you point out exactly where the exclude needs to go?

The snapshot action does allow for index filtering using a filter block. What you would need would be a pattern filter.

Something like this would exclude anything starting with .security:

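- filtertype: pattern
  kind: prefix
  value: '\.security'
  exclude: True

Escaping the period is necessary because in a regular expression an unescaped period matches any character. The value is single-quoted because `\.` is not a valid escape sequence inside a double-quoted YAML string.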
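To show where the filter block sits, here is a minimal sketch of a complete snapshot action file. The repository name `my_repo` is a placeholder, and the remaining option values are only assumptions about a typical setup, not the poster's actual configuration:

```yaml
actions:
  1:
    action: snapshot
    description: >-
      Snapshot all indices except those whose names start with .security
    options:
      repository: my_repo          # placeholder: your registered repository name
      name: curator-%Y%m%d%H%M%S   # yields names like curator-20211002031732
      ignore_unavailable: False
      partial: False
      wait_for_completion: True
    # filters is a sibling of options, at the same level under the action ID
    filters:
    - filtertype: pattern
      kind: prefix
      value: '\.security'          # single-quoted so YAML keeps the backslash
      exclude: True
```

The `filters` list belongs to the action itself, next to `options`, rather than inside `options`.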

@theuntergeek I tried this filtertype. Whether I take the snapshot with this filtertype or without it, the Curator log does not show the ".security" index being snapshotted. But when I restore, the Curator log still shows this index being restored.

Snapshot log:

INFO Preparing Action ID: 1, "snapshot"
INFO Trying Action ID: 1, "snapshot": >- Snapshotting indices
INFO Creating snapshot "curator-20211002031732" from indices: ['test_index_1', 'test_index_2','abc_1','abc_2']
INFO Snapshot curator-20211002031732 still in progress.
INFO Snapshot curator-20211002031732 successfully completed.
INFO Action ID: 1, "snapshot" completed.
INFO Job completed.

And, this is restore of this snapshot:

INFO      Preparing Action ID: 1, "restore"
INFO      Creating client object and testing connection
INFO      Instantiating client object
INFO      Testing client connectivity
INFO      Successfully created Elasticsearch client object with provided settings
INFO      Trying Action ID: 1, "restore": >- Restore all indices
INFO      Restoring indices "['test_index_1', 'test_index_2','abc_1','abc_2','.security-7']" from snapshot: curator-20211002031732
ERROR     Failed to complete action: restore.  <class 'curator.exceptions.FailedExecution'>: Exception encountered.  Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: Unable to obtain recovery information for specified indices. Error: AuthenticationException(401, 'security_exception', 'unable to authenticate user [admin] for REST request [/test_index_1,test_index_2,abc_1,abc_2,.security-7/_recovery?human=true]')

I tried the filtertype in the restore config as well, but it does not help and the restore fails with the same error as above. Please check and let me know how to avoid restoring this ".security" index.

This does not compute. Please set up DEBUG logging so I can see whatever else might be happening here, because .security-7 should not have been captured according to that list from the snapshot run.

I also need to see the full action definition. Are you including cluster state? That might automatically include the security indexes now, and should not be used or needed for this use case.
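For reference, a hedged sketch of a restore action that leaves cluster state out and names the indices to restore explicitly. In a restore action the `filters` block selects snapshots, not indices, so an index pattern filter there cannot exclude `.security-7`; the `indices` option is what limits the restored indices. The repository name `my_repo` is a placeholder, the index list is copied from the snapshot log above, and whether this resolves the problem is not confirmed in this thread:

```yaml
actions:
  1:
    action: restore
    description: >-
      Restore only the named indices, without cluster state
    options:
      repository: my_repo      # placeholder: your registered repository name
      name:                    # blank: use the most recent snapshot passing the filters
      indices: ['test_index_1', 'test_index_2', 'abc_1', 'abc_2']
      include_global_state: False
      wait_for_completion: True
    # These filters choose which snapshot to restore from, not which indices
    filters:
    - filtertype: state
      state: SUCCESS
```

On the snapshot side, the corresponding setting is `include_global_state: False` under the snapshot action's `options`.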

Yeah, my guess is the same. Before the upgrade to 7.14, restoring a Curator snapshot never replaced the security index. I will try disabling the cluster state, then take a snapshot and restore it to see if that works.
