I decided to post a question about ES cluster design because I think my current design is not very good.
My goal is to set up an ES infrastructure that will:
- be able to index/update 10 000 - 100 000 (or more) documents daily; there is also data (100 000 - 1 000 000) that will be read-only
- be failure-resistant (I understand there are many levels to this, but let's talk about a basic failure-resistant design)
I can currently run 8 nodes in total (limited by my resources).
My design #1:
- 2 clusters, running on separate machines
- each cluster contains 2 dedicated master-eligible nodes and 2 data nodes
- usage: cluster1 and cluster2 contain mirrored data. When an update/index run is needed, I point my application at cluster1 and update cluster2; if everything is OK, I switch the application to cluster2 and then synchronize cluster2 -> cluster1
- goal: to have a data backup and not limit the application while a large index/update is running
- problems: even number of data nodes (I know about the split-brain problem), quite complicated data migration
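To make the split-brain concern concrete, here is a small sketch of the usual majority rule (quorum = floor(n / 2) + 1). As far as I understand, the quorum is counted over master-eligible nodes rather than data nodes, so that is what the sketch assumes; the function names are mine and purely illustrative.

```python
def quorum(master_eligible: int) -> int:
    """Majority of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_master_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can fail while a majority remains."""
    return master_eligible - quorum(master_eligible)


if __name__ == "__main__":
    # Compare 2 master-eligible nodes (design #1, per cluster) with 3 (design #2).
    for n in (2, 3):
        print(n, quorum(n), tolerated_master_failures(n))
```

Under this rule, 2 master-eligible nodes need both to be alive to form a majority, while 3 can lose one and still elect a master.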
My design #2:
- 1 cluster
- 3 dedicated master-eligible nodes and 5 data nodes, each node running on a separate machine
- here is my question: how should I design the update/index scenario so that it does not throttle or limit the application, while still having a backup?
- would you suggest using aliases together with the backup & restore function of ES?
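To illustrate what I mean by using aliases: the idea would be to build a new index in the background and then move an alias from the old index to the new one in a single `POST /_aliases` request, so the application always queries the alias. Below is a minimal sketch that only builds the request body; the alias and index names (`products`, `products_v1`, `products_v2`) and the helper name are made up for the example.

```python
import json


def alias_swap_actions(alias: str, old_index: str, new_index: str) -> dict:
    """Build a body for POST /_aliases that moves `alias` in one request."""
    return {
        "actions": [
            # Detach the alias from the index currently serving the application.
            {"remove": {"index": old_index, "alias": alias}},
            # Attach it to the freshly built index.
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    body = alias_swap_actions("products", "products_v1", "products_v2")
    print(json.dumps(body, indent=2))
```

The body would then be sent to the cluster with whatever HTTP or ES client the application already uses; I have left that part out because it depends on the setup.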
(I accidentally sent an incomplete post earlier, sorry.)