Fundamental question about ES data/shards


From my understanding, all the data in ES is stored in the shards, is that right? And the data is not redundant, so each shard has only a portion of the total data set (not counting replicas)? I ask because we want to replace 2 nodes in our cluster with 2 beefier nodes, and repurpose the 2 older servers. Since there is a significant amount of data, our plan was to add 1 server to the cluster, allow the data to transfer over, and then add the other one. I wasn't sure of the best way to go about this. Would it be better to disable shard reallocation? If shard reallocation is disabled, then theoretically no data would get transferred, correct? How do we ensure all the data gets copied over? I did some searching, but I wasn't able to find a way to manually move shards. I suppose we could just copy the physical files over and have ES recover and pick them up, but I'm not sure if that's the best way to go about it.

(Nik Everett) #2

I'd add the other two nodes to the cluster and then use
cluster.routing.allocation.exclude to remove the shards from the old nodes
and then shut them down.
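
For reference, here is a rough sketch of what setting cluster.routing.allocation.exclude could look like from Python, using only the standard library. It assumes the old nodes are identified by IP (the `_ip` attribute; `_name` and `_host` are alternatives) and that the setting is applied as a transient cluster setting. The addresses, the base URL, and the helper names `build_exclude_settings` and `apply_settings` are made up for illustration.

```python
import json
import urllib.request


def build_exclude_settings(node_ips):
    """Build a PUT /_cluster/settings body that excludes nodes by IP.

    The setting takes a comma-separated list of values.
    """
    return {
        "transient": {
            "cluster.routing.allocation.exclude._ip": ",".join(node_ips)
        }
    }


def apply_settings(base_url, body):
    """Send the settings body to the cluster (base_url is an assumption)."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical addresses of the two old nodes.
    body = build_exclude_settings(["10.0.0.1", "10.0.0.2"])
    print(json.dumps(body, indent=2))
    # apply_settings("http://localhost:9200", body)  # uncomment to send
```

Once the old nodes have been shut down, the exclusion can be cleared again by setting the same key to an empty string.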

One thing to keep in mind is that cluster.routing.allocation.exclude causes
elasticsearch to push the shards off of the excluded nodes with quite a lot
of haste. It'll move them to all of the other nodes and then the balancer
will kick in and start balancing the load. So if you are close to the edge
on power you may want to just add the new nodes and wait a while for the
balancer to balance shards to those nodes and then use
cluster.routing.allocation.exclude to clear the shards off the old nodes.
That way the shards will get allocated relatively evenly.
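
Before shutting the old nodes down, it may help to confirm they no longer hold any shards. The sketch below assumes you have fetched `_cat/shards?format=json` and parsed it into a list of dicts, each with a `node` field. It also assumes that a relocating shard reports its node as `source -> ip id target`, so only the part before ` -> ` is counted. The helper names and node names are hypothetical.

```python
from collections import Counter


def shards_per_node(cat_shards_rows):
    """Count shards per node from parsed _cat/shards JSON rows."""
    counts = Counter()
    for row in cat_shards_rows:
        node = row.get("node")
        if not node:
            # Unassigned shards have no node; skip them.
            continue
        # Assumption: relocating shards look like "source -> ip id target".
        source = node.split(" -> ")[0]
        counts[source] += 1
    return counts


def nodes_are_drained(cat_shards_rows, old_nodes):
    """Return True if none of the named nodes still holds a shard."""
    counts = shards_per_node(cat_shards_rows)
    return all(counts[name] == 0 for name in old_nodes)


if __name__ == "__main__":
    # Hypothetical rows: one settled shard, one still relocating off old-1.
    rows = [
        {"index": "logs", "shard": "0", "node": "new-1"},
        {"index": "logs", "shard": "1", "node": "old-1 -> 10.0.0.5 abc new-2"},
    ]
    print(nodes_are_drained(rows, ["old-1"]))
```

A shard that is still relocating counts against its source node, so the check only passes once the move has finished.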



Thanks for the suggestion, Nik. So the cluster.routing.allocation.exclude command will cause the shards to go to any other node? Say we only want to join 1 node at the beginning: if we use cluster.routing.allocation.exclude, then the shards will get pushed to the new node AND the other old node that's still there? Or, if we do add both nodes and use cluster.routing.allocation.exclude on both old nodes, is there a possibility that all of the data could get sent to just one node, and then balancing would spread it across both nodes?
