Hi Guys,
We are currently planning to expand our Elasticsearch cluster. We have two options. Option 1: create one big cluster (about 10 nodes) with a hot/warm architecture. Option 2: create multiple clusters (one hot cluster with X-Pack Gold, one warm cluster with X-Pack Basic and one cold cluster with X-Pack Basic). Migration of data would be triggered via Curator and the reindex API. Searches would be done via cross-cluster search.
Reindexing into a separate warm or cold cluster makes those nodes perform indexing, which is exactly what the hot/warm/cold architecture is designed to prevent. Moving data with snapshot and restore would avoid this problem, but might be more difficult operationally. I am also not sure whether cross-cluster search works across clusters with different license levels.
I would therefore recommend option 1, which will likely save you a lot of hassle and some potentially exotic and unusual problems.
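For reference, the snapshot and restore route mentioned above could look roughly like the sketch below. It only builds the two REST requests (create a snapshot on the hot cluster, restore it on the warm cluster) and does not send anything. The repository name `shared_repo`, the snapshot name and the index name are made up for illustration, and it assumes both clusters have the same snapshot repository registered.

```python
import json


def snapshot_requests(repo, snapshot, indices):
    """Return (create, restore) requests as (method, path, body) tuples.

    The create request is meant for the source (hot) cluster and the
    restore request for the target (warm) cluster. Nothing is sent here.
    """
    body = {
        "indices": ",".join(indices),
        # Skip cluster-wide state so only the listed indices are moved.
        "include_global_state": False,
    }
    create = ("PUT", f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true", body)
    restore = ("POST", f"/_snapshot/{repo}/{snapshot}/_restore", dict(body))
    return create, restore


# Hypothetical names, for illustration only.
create, restore = snapshot_requests("shared_repo", "logs-2018-01", ["logs-2018.01.01"])
for method, path, body in (create, restore):
    print(method, path)
    print(json.dumps(body))
```

Unlike reindexing, the restore copies the existing segment files, so the warm nodes do not have to index the documents again.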
Thanks @Christian_Dahlqvist. This is to roll over old data to less expensive nodes. My client has a warm standby Elasticsearch cluster that they keep updated with batch data, running the batch job twice: once for the primary and then for the secondary (of course, sometimes the two are out of sync). Anyway, they have used the warm standby when users have reported holes in the data on the primary. They then debug the issue, correct the data on the standby and then move the users to the other cluster. I understand this is an expensive solution, but is there another way to give them this flexibility?
@Christian_Dahlqvist the blog doesn't mention a cluster spread over 3 zones, but yes, I get that a failover setup can be achieved with a 3-zone cluster. What would be the best way to utilize 9 nodes, 6 of them data nodes (right now in 2 separate clusters), to accommodate the use case mentioned earlier? Are 3 data nodes sufficient for the data plus 1 replica? This is all on-premise. Is bringing down 3 nodes and creating a cluster from a snapshot as needed a feasible option for troubleshooting, and eventually something to fall back to once fixed?
The blog assumes a single cluster with 2 types of data nodes. On top of this you typically also have dedicated master nodes. New indices are created on a set of nodes in the cluster that we refer to as the hot zone. These nodes do all the indexing and hold data for a reasonably short period of time. Once the indices are read-only, they are migrated to the other nodes in the cluster (the warm zone) using shard allocation filtering.
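As a rough illustration of the migration step, here is a small sketch that builds the index settings update used for shard allocation filtering. It assumes the data nodes were started with a custom node attribute called `box_type` set to `hot` or `warm` (for example `node.attr.box_type: warm` in `elasticsearch.yml`); the attribute name is your choice, and the index name below is made up.

```python
import json


def warm_migration_request(index_name, attribute="box_type", target="warm"):
    """Build the settings update that relocates an index to the target zone.

    Returns (method, path, body). Once applied, Elasticsearch moves the
    index's shards to nodes whose custom attribute matches `target`.
    """
    path = f"/{index_name}/_settings"
    body = {f"index.routing.allocation.require.{attribute}": target}
    return "PUT", path, body


# Hypothetical index name, for illustration only.
method, path, body = warm_migration_request("logs-2018.01.01")
print(method, path)
print(json.dumps(body))
```

Curator or a scheduled job would typically send this request once an index is no longer being written to.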