Curator Snapshot Error with Closed indices

wanshishi · June 1, 2018, 4:43pm

The snapshot action fails on closed indices.

I have a Curator action file that runs multiple actions. I am not sure whether I have set it up correctly, or whether the order of the actions affects how they run.

1. allocation
2. forcemerge
3. close
4. snapshot
5. delete

When the cron job for Curator runs, it performs actions 1, 2, and 3. Then, when the snapshot action starts, I get this error:
ERROR Failed to complete action: snapshot. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: TransportError(403, u'cluster_block_exception', u'blocked by: [FORBIDDEN/4/index closed];')

To work around this error, I re-opened the indices and then ran the snapshot, and it went fine.
But when my script runs again, it fails, because it closes the indices again.
Is there a way for the snapshot action to ignore closed indices? Or is there a better solution?

actions:
  1:
    action: allocation
    description: "Apply shard allocation filtering rules to the specified indices. Move files older than 2 days"
    options:
      key: node_type
      value: warm
      allocation_type: require
      wait_for_completion: true
      timeout_override:
      ignore_empty_list: true
      continue_if_exception: false
      disable_action: false
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 2

  2:
    action: forcemerge
    description: >-
      forceMerge logstash- prefixed indices older than 1 day (based on index
      creation_date) to 1 segments per shard. Delay 120 seconds between each
      forceMerge operation to allow the cluster to quiesce.
      This action will ignore indices already forceMerged to the same or fewer
      number of segments per shard, so the 'forcemerged' filter is unneeded.
    options:
      max_num_segments: 1
      delay: 120
      timeout_override:
      ignore_empty_list: true
      continue_if_exception: false
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
      exclude:
    - filtertype: age
      source: creation_date
      direction: older
      unit: days
      unit_count: 1
      exclude:

  3:
    action: close
    description: >-
      Close indices older than 90 days (based on index name), for logstash- and syslog- prefixed indices.
    options:
      delete_aliases: False
      timeout_override:
      ignore_empty_list: true
      continue_if_exception: false
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 90
      exclude:

  4:
    action: snapshot
    description: "Snapshot logstash- older than 91 day based on index creation_date.Wait for snapshot to complete."
    options:
      repository: es_backups
      name: logstash-%Y%m%d%H%M%S
      ignore_unavailable: false
      include_global_state: True
      partial: False
      wait_for_completion: True
      skip_repo_fs_check: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
    - filtertype: age
      source: name
      timestring: '%Y.%m.%d'
      direction: older
      unit: days
      unit_count: 91

  5:
    action: delete_indices
    description: >-
      Delete indices older than 91 days (based on index name), logstash- and syslog-
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 91
      exclude:

Specifications
Version: ES Version 6.0.0
Platform: Ubuntu
Curator 5.5

theuntergeek · June 1, 2018, 5:06pm

Elasticsearch cannot snapshot a closed index, which the error message you received makes plain.

You have two options. Which you choose will depend on the desired outcome.

1. Re-order your actions so snapshot comes before close. Actions are performed in the order in which they appear.
2. Add a closed filter to your snapshot filter list.

Option 1 guarantees that the indices you just forcemerged get snapshotted. This is what I presume you wanted to have happen. Option 2 will simply tell the snapshot action to ignore closed indices. If you want the closed indices to be snapshotted, this is perhaps not the option you want to go with. You should simply re-order the actions to close the indices after they've been snapshotted.
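For reference, a minimal sketch of what option 2 could look like in the snapshot action's filter list. The pattern and age filters are copied from the action file in the question; the added entry assumes Curator's `closed` filtertype, where `exclude: True` drops closed indices from the actionable list before the snapshot starts.

```yaml
# Sketch only: filter list for the snapshot action (action 4 in the question).
filters:
- filtertype: pattern
  kind: prefix
  value: logstash-
# Assumed addition: remove closed indices from the actionable list
# so the snapshot is only attempted on open indices.
- filtertype: closed
  exclude: True
- filtertype: age
  source: name
  direction: older
  timestring: '%Y.%m.%d'
  unit: days
  unit_count: 91
```

With this filter in place, indices that were closed by an earlier run are skipped rather than snapshotted, which is why it only fits if those indices do not need to be in the snapshot.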

wanshishi · June 1, 2018, 5:35pm

I tried option 1 and it works on the first run, but the next time the cron job runs it fails, because there are now closed indices. Will the delete action delete the closed indices too?

theuntergeek · June 1, 2018, 6:35pm

The delete_indices action will only delete indices you have identified by your filters.

Is snapshotting at older than 91 days just to save the indices before deletion? Since you are snapshotting and then deleting indices older than 91 days, and you are closing indices older than 90 days, it seems to me that the close action is unnecessary. A closed index for only 1 day is not going to change much.
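As a rough sketch of that suggestion, the tail of the action file could drop the close action and go straight from snapshot to delete. The repository, snapshot name, filters, and thresholds below are taken from the question; the renumbering of the actions to 3 and 4 is an assumption for illustration, with allocation and forcemerge staying as actions 1 and 2.

```yaml
# Sketch only: actions 1 (allocation) and 2 (forcemerge) stay as posted.
actions:
  3:
    action: snapshot
    description: "Snapshot logstash- indices older than 91 days (based on index name) before they are deleted."
    options:
      repository: es_backups
      name: logstash-%Y%m%d%H%M%S
      ignore_unavailable: false
      include_global_state: True
      partial: False
      wait_for_completion: True
      skip_repo_fs_check: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 91
  4:
    action: delete_indices
    description: "Delete logstash- indices older than 91 days (based on index name)."
    options:
      ignore_empty_list: True
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 91
```

Because nothing is closed in this layout, the snapshot action never meets a closed index, and the delete action only runs after the snapshot has been waited on.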

These numbers are rather large. Can your cluster handle 90 days of open indices, in terms of shard count per node?

wanshishi · June 1, 2018, 6:56pm

Yes, the snapshot is to save indices before deletion (somewhat like a backup). And yes, my cluster can handle 90 days of open indices.

So from your perspective, there is no point in closing the indices.

Thank you, I am learning something...

theuntergeek · June 1, 2018, 7:34pm

Closing the indices is something to do if you couldn't handle that many open indices, even though your storage can handle it. Each open shard has a resource cost (heap space), so the more you have, the more memory constrained your nodes become for indexing operations. In a cluster that could hold 90 days' worth of indices, but could not keep more than 30 days' worth of indices in an open state, you'd snapshot at 30 days, then close the indices to keep them present, but not used.
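A hedged sketch of that 30-day scenario as an action file follows. The 30 and 90 day thresholds come from the example above; the repository name, snapshot name, and logstash- prefix are borrowed from the question, and the `closed` filter on the snapshot action is an assumed addition so that later runs skip indices that were already snapshotted and closed.

```yaml
# Sketch only: snapshot at 30 days, then close, then delete at 90 days.
actions:
  1:
    action: snapshot
    description: "Snapshot open logstash- indices older than 30 days."
    options:
      repository: es_backups
      name: logstash-%Y%m%d%H%M%S
      wait_for_completion: True
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
    # Assumed addition: skip indices closed by a previous run.
    - filtertype: closed
      exclude: True
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 30
  2:
    action: close
    description: "Close logstash- indices older than 30 days, after the snapshot."
    options:
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 30
  3:
    action: delete_indices
    description: "Delete logstash- indices older than 90 days."
    options:
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 90
```

The ordering matters here: snapshot runs before close, so each index is captured while it is still open and then stays on disk, closed, until the delete threshold is reached.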

wanshishi · June 1, 2018, 8:51pm

Thank you for the clarification master @theuntergeek

system · June 29, 2018, 8:51pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
