Data nodes separation (via attributes) vs. clusters separation

Hi,

We are a B2B company that maintains one index per customer.
Every index is rebuilt from scratch every day and, once it is ready, it replaces the previous day's index.
Once the index is ready, it is in read-only mode and used for search only.
For better isolation, we use two "zones" for data nodes: a work zone, where shards are being indexed, and a serve zone, where finished indices are served.
We achieve that by setting node.attr.zone on the nodes and cluster.routing.allocation.awareness.attributes: zone.
After each index becomes ready, we relocate its shards to the serve zone and add more replicas.
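
To make the relocation step concrete, here is a minimal sketch of the settings update it could correspond to. It assumes the move is done with index-level allocation filtering (index.routing.allocation.require.zone), which the post does not state explicitly; the index name and replica count are made up for illustration. The code only builds the request and does not contact a cluster, and it does not model how the filter interacts with the awareness setting above.

```python
import json
from typing import Tuple


def serve_zone_settings(zone: str = "serve", replicas: int = 2) -> dict:
    """Build an index settings body that pins shards to one zone.

    Assumption: nodes carry a custom attribute `node.attr.zone`, so the
    matching index-level filter is `index.routing.allocation.require.zone`.
    """
    return {
        "index.routing.allocation.require.zone": zone,
        "index.number_of_replicas": replicas,
    }


def settings_request(index: str, body: dict) -> Tuple[str, str, str]:
    """Return (method, path, JSON payload) for the update index settings API."""
    return ("PUT", f"/{index}/_settings", json.dumps(body))


# Hypothetical index name, used only for illustration.
method, path, payload = settings_request("customer-a", serve_zone_settings())
print(method, path, payload)
```

The same body with zone set to "work" would describe where a freshly created index is allowed to live while it is being built.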

Now, as part of our cluster upgrade (to the latest major Elastic version), we are considering splitting this architecture into two separate clusters: a work cluster and a serve cluster.
We saw that Elasticsearch has a cross-cluster replication feature, but we don't need the leader/follower pattern: we just want to move the shards to the serve cluster without leaving any trace in the work cluster, since there won't be any further writes.

Is there a way to relocate an index's shards between clusters? If there is, what are the pros and cons compared to the original approach? What would you recommend?

Thanks!

I do not see the point in adopting cross-cluster replication here unless you are looking to set up a second cluster for redundancy in another region. If your current approach is working, I would recommend sticking with it and not splitting the cluster.

Are there any particular problems you are looking to address with this split that we are maybe not aware of?

I agree with Christian there.

Thanks guys!

As part of our upgrade we are moving to Kubernetes (with the Elastic operator), and we thought separate clusters might make it easier to:

  • configure on-demand instances for the serve nodes vs. spot instances for the work nodes.
  • configure a scale-out/scale-in policy for when we are not indexing (to save money). I know you don't offer anything for this yet, so we thought we would simply uninstall the work cluster during off-hours.
  • get better isolation, to limit the impact of unpredicted load etc. (although I must say it's super stable today, knock on wood).

We can implement all of those configurations in the new architecture as well; we just want to make sure we're not missing anything better before we start :)
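
For comparison, here is a rough sketch of what the single-cluster variant could look like with the Elastic operator: one Elasticsearch resource with separate nodeSets for the work and serve zones, each pinned to a different kind of Kubernetes node. The nodeSelector label key (example.com/capacity-type), the resource name, the node counts and the version string are all placeholders invented for this example; the real label depends on the cloud provider and on how the node pools are set up.

```python
import json


def node_set(name: str, count: int, roles: list, capacity: str, zone: str = "") -> dict:
    """Describe one ECK nodeSet pinned to a capacity type.

    `example.com/capacity-type` is a hypothetical node label, not a real one.
    """
    config = {"node.roles": roles}
    if zone:
        # Custom attribute that allocation rules can refer to.
        config["node.attr.zone"] = zone
    return {
        "name": name,
        "count": count,
        "config": config,
        "podTemplate": {
            "spec": {"nodeSelector": {"example.com/capacity-type": capacity}}
        },
    }


def cluster_manifest(name: str, version: str, work_count: int, serve_count: int) -> dict:
    """Build an Elasticsearch custom resource with work and serve nodeSets."""
    return {
        "apiVersion": "elasticsearch.k8s.elastic.co/v1",
        "kind": "Elasticsearch",
        "metadata": {"name": name},
        "spec": {
            "version": version,
            "nodeSets": [
                # Dedicated masters on on-demand capacity (an assumption).
                node_set("master", 3, ["master"], "on-demand"),
                node_set("work", work_count, ["data"], "spot", zone="work"),
                node_set("serve", serve_count, ["data"], "on-demand", zone="serve"),
            ],
        },
    }


# Placeholder name, version and counts; substitute real values.
manifest = cluster_manifest("search", "8.13.0", work_count=4, serve_count=6)
print(json.dumps(manifest, indent=2))
```

In this layout, scaling the work side up or down is a change to one nodeSet's count rather than installing or uninstalling a whole cluster; whether that is good enough for the off-hours saving is something to test.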