Consolidate shards of offline indexes

(Eric Brunson) #1

We have a six-node cluster on which we have date-partitioned logstash data going back three or so years. All of these indexes except the last six months' worth are closed, but we wish to keep the data around for the foreseeable future. We recently upgraded the storage on three of the nodes and wish to make the other three nodes dataless, but the nodes that will become dataless still hold shards of the closed indexes.

It's tedious and error-prone to open each index and back it up, and attempts to automate the process have resulted in performance problems. Is there a way to consolidate these shards onto the three data nodes without bringing the indexes online? Is it safe to simply copy the data over outside of Elasticsearch?


(Mark Walkom) #2

That might work. When you close an index, the location of its shards is taken out of the cluster state, and when you open it again the cluster looks for the relevant shards.
You can at least try it; just make sure you have backups if the data is important.

What errors do you get when you open them now though?

(Eric Brunson) #3

It isn't so much that there are errors right away. The indexes are large with many replicas, so it takes a minute or two for an index to come online, and the number of initializing shards can block new indexes from getting their shards allocated, which causes problems for them. Honestly, it's mostly that I haven't taken the time to poll the index state properly to confirm that all shards have finished allocating before I try to back up.
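For what it's worth, the polling I have in mind would look roughly like the sketch below. It assumes the cluster health API (`GET /_cluster/health/<index>`) and its `status`, `initializing_shards`, `relocating_shards`, and `unassigned_shards` response fields; the function names, the base URL, and the timing values are all made up for illustration, and I have not run this against our cluster.

```python
import json
import time
import urllib.request


def fetch_health(base_url, index):
    """GET /_cluster/health/<index> and return the parsed JSON body."""
    url = f"{base_url}/_cluster/health/{index}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def shards_settled(health):
    """True once the index is green and no shard is initializing, relocating, or unassigned."""
    return (
        health.get("status") == "green"
        and health.get("initializing_shards", 0) == 0
        and health.get("relocating_shards", 0) == 0
        and health.get("unassigned_shards", 0) == 0
    )


def wait_until_allocated(index, get_health, max_wait=600, interval=5, sleep=time.sleep):
    """Poll get_health(index) until the shards settle or max_wait seconds of sleeping pass.

    get_health is any callable that takes an index name and returns a
    health dict, which keeps the loop testable without a live cluster.
    Returns True if the shards settled, False if the wait ran out.
    """
    waited = 0
    while True:
        if shards_settled(get_health(index)):
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval
```

A backup script would open one index, call something like `wait_until_allocated(name, lambda i: fetch_health("http://localhost:9200", i))` (the URL is a placeholder), and only start the backup if it returns True. Handling one index at a time this way should also keep the number of initializing shards low enough that new indexes are not starved.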

I should probably just try it, as you say. I'll set up a test cluster and see if it works. It's just a little nerve-wracking to know that a screw-up could lose the historical data in production.

Thank you for your input.
