Curator fails to shrink index


#1

Hello,

I wanted to test the shrink action in Curator, but it fails with:

2017-09-11 09:04:56,314 ERROR curator.cli run:184 Failed to complete action: shrink. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: Insufficient space available for 2x the size of index "some-logs-000015", shrinking will exceed space available. Required: 13960003472, available: 0

There is another message which might be of interest:
2017-09-11 09:25:08,971 INFO Node "{0}" has multiple data paths and will not be used for shrink operations.

There is definitely enough space available. GET _nodes/stats/fs?human&pretty returns:
...
"fs": {
  "timestamp": 1505113876747,
  "total": {
    "total": "14.3tb",
    "total_in_bytes": 15747518529536,
    "free": "14.1tb",
    "free_in_bytes": 15568670572544,
    "available": "14.1tb",
    "available_in_bytes": 15568603463680,
    "spins": "true"
  },
...

The action file:

actions:
  1:
    action: shrink
    description: >-
      shrink indices
    options:
      disable_action: False
      ignore_empty_list: True
      shrink_node: DETERMINISTIC
      node_filters:
        permit_masters: True
      number_of_shards: 1
      number_of_replicas: 1
      shrink_prefix:
      shrink_suffix: '-shrink'
      delete_after: True
      wait_for_active_shards: 1
      extra_settings:
        settings:
          index.codec: best_compression
      wait_for_completion: True
      wait_interval: 9
      max_wait: -1
    filters:
      - filtertype: pattern
        kind: prefix
        value: some-logs-000015

Any ideas?

Regards,
M


(Aaron Mildenstein) #2

Having multiple data paths is problematic for a shrink operation, if I understand it correctly. What has to happen for a shrink to work is that all of the shards must be in the same data path. If you have multiple data paths on the same node, some of the shards or replicas could end up on a different file system, which would result in a shrink not working properly.
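
To see which nodes fall into this category, a minimal Python sketch along these lines may help. It assumes the nodes stats response (GET _nodes/stats/fs) has already been parsed into a dict and that each node lists its data paths under fs.data; the function name, node IDs, and paths are hypothetical, and this is not Curator's own code.

```python
def nodes_with_multiple_data_paths(nodes_stats):
    """Return {node_id: [paths]} for nodes that report more than one data path."""
    multi = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        # Each entry under fs.data describes one configured data path.
        paths = [entry.get("path") for entry in node.get("fs", {}).get("data", [])]
        if len(paths) > 1:
            multi[node_id] = paths
    return multi


# Hypothetical response fragment; node IDs and paths are made up.
sample_stats = {
    "nodes": {
        "node-a": {"fs": {"data": [{"path": "/data1"}, {"path": "/data2"}]}},
        "node-b": {"fs": {"data": [{"path": "/data"}]}},
    }
}

print(nodes_with_multiple_data_paths(sample_stats))
```

Any node returned here is one that would trigger the "has multiple data paths" message quoted in the first post.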

If you want to shrink to a single shard on a node with multiple data paths, you're probably better off using a reindex action instead. You'd have to create the new index with routing parameters first, to force it onto that particular node, and then do the reindex operation.
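
As a rough illustration of that reindex approach, the request bodies could be built like this in Python. The index.routing.allocation.require._name setting and the reindex body shape are standard Elasticsearch; the helper names, the node name, and the destination index name are hypothetical. Replicas are set to 0 here as an assumption, because a replica cannot be allocated while the index is pinned to a single node.

```python
def routed_index_body(node_name, shards=1, replicas=0):
    """Body for creating the destination index pinned to one node."""
    return {
        "settings": {
            "index.number_of_shards": shards,
            "index.number_of_replicas": replicas,
            # Force every shard of the new index onto the named node.
            "index.routing.allocation.require._name": node_name,
        }
    }


def reindex_body(source, dest):
    """Body for POST _reindex, copying documents from source to dest."""
    return {"source": {"index": source}, "dest": {"index": dest}}


# Hypothetical node and destination names, for illustration only.
create_body = routed_index_body("data-node-1")
copy_body = reindex_body("some-logs-000015", "some-logs-000015-reindexed")
print(create_body)
print(copy_body)
```

The first body would go to PUT /some-logs-000015-reindexed and the second to POST /_reindex, in that order.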


(Aaron Mildenstein) #3

I did some asking, and it appears I can override this, but it will amount to much the same thing as a reindex, in the end (emphasis added):

Shrinking works as follows:

  • First, it creates a new target index with the same definition as the source index, but with a smaller number of primary shards.
  • Then it hard-links segments from the source index into the target index. (If the file system doesn’t support hard-linking, then all segments are copied into the new index, which is a much more time consuming process.)
  • Finally, it recovers the target index as though it were a closed index which had just been re-opened.
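
For reference, the requests that lead into the steps above can be sketched as follows. This is only an illustration of the two Elasticsearch calls (prepare the source index, then shrink), not what Curator runs internally; the function name and the node name are hypothetical, and the shard, replica, and codec values are taken from the action file in the first post.

```python
def shrink_requests(source, target, node_name, shards=1, replicas=1,
                    codec="best_compression"):
    """Return (method, path, body) tuples for a manual shrink."""
    prepare = (
        "PUT",
        f"/{source}/_settings",
        {
            "settings": {
                # Relocate a copy of every shard to one node ...
                "index.routing.allocation.require._name": node_name,
                # ... and block writes so the segments stop changing.
                "index.blocks.write": True,
            }
        },
    )
    shrink = (
        "POST",
        f"/{source}/_shrink/{target}",
        {
            "settings": {
                "index.number_of_shards": shards,
                "index.number_of_replicas": replicas,
                "index.codec": codec,
            }
        },
    )
    return [prepare, shrink]


# "data-node-1" is a made-up node name.
requests_to_send = shrink_requests(
    "some-logs-000015", "some-logs-000015-shrink", "data-node-1"
)
for method, path, body in requests_to_send:
    print(method, path, body)
```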

The problem with trying to get Curator to do this is that it still requires a guarantee that one of the data paths has sufficient space for 2x the index space, because shards cannot span filesystems; they must exist entirely on a single filesystem. Curator would have no control over which filesystem is chosen, or how to ensure that there is sufficient space on that filesystem. It is possible to test and enforce that all data paths have sufficient space to accommodate the 2x index space needed, but is that the right solution? I don't know. I do know that a shrink will fail if an attempt is made and the filesystem the single shard is targeting does not have sufficient space, even if the combined free space across the other filesystems would be sufficient.
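
To make the per-filesystem point concrete, here is a minimal Python sketch of such a check. It is not Curator's implementation; the function names and the sample path sizes are hypothetical. The index size is half of the "Required: 13960003472" figure from the error message, since that figure is 2x the index size.

```python
def paths_with_room(data_paths, index_size_bytes, factor=2):
    """Return the data paths that can individually hold factor x the index."""
    required = factor * index_size_bytes
    return [p["path"] for p in data_paths if p["available_in_bytes"] >= required]


def all_paths_have_room(data_paths, index_size_bytes, factor=2):
    """True only if every data path could take the shrunken index."""
    return len(paths_with_room(data_paths, index_size_bytes, factor)) == len(data_paths)


index_size = 13960003472 // 2  # half of the "Required" value in the error

# Made-up example: the total free space is enough, but no single path is.
split_paths = [
    {"path": "/data1", "available_in_bytes": 10_000_000_000},
    {"path": "/data2", "available_in_bytes": 10_000_000_000},
]
# Made-up example: every path is large enough on its own.
roomy_paths = [
    {"path": "/data1", "available_in_bytes": 14_000_000_000},
    {"path": "/data2", "available_in_bytes": 20_000_000_000},
]

print(paths_with_room(split_paths, index_size))
print(all_paths_have_room(roomy_paths, index_size))
```

In the first example the two paths add up to more than the required space, yet neither qualifies on its own, which is the failure case described above.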

Shrink is exceptionally complicated, and this is just one more reason it is difficult. I will have to consider and consult with some others here at Elastic and see what, if any, solution there should be to shrink on a node with multiple data paths.


#4

Each data path used in this setup does have sufficient disk space for this particular operation.

Thank you for taking a look at this, Aaron.

