I am designing a solution based on many small Elasticsearch instances scattered around the world, plus a single central instance containing the combined data from all of them. I do not need the central data to be up to date to the second, but I do require eventual consistency within a minute or so.
Each small instance is a single cluster with either one or two nodes.
What is the best approach? Should my central instance be a cluster, with all the smaller clusters syncing their data into it? Is there such a feature as a cluster of clusters?
If this is not the best way, what are the alternatives?
Should I perhaps use one of the Beats to push the data from each individual instance to the central one?
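To make the push idea concrete, here is a minimal sketch in Python of the payload I imagine each small instance (or a shipper running next to it) would send to the central cluster's `_bulk` endpoint. The function name, the `site_id` prefix on document IDs, and the index name are my own assumptions for illustration, not an existing feature. Only the newline-delimited action/source format comes from the Bulk API.

```python
import json


def build_bulk_body(site_id, index, docs):
    """Build a newline-delimited JSON body for Elasticsearch's _bulk endpoint.

    docs is an iterable of (doc_id, source_dict) pairs read from the local
    instance. site_id and index are hypothetical names chosen by me.
    """
    lines = []
    for doc_id, source in docs:
        # Assumption: prefix the ID with the site so that documents from
        # different small instances never collide in the central index.
        action = {"index": {"_index": index, "_id": f"{site_id}:{doc_id}"}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    if not lines:
        return ""
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = build_bulk_body("eu-1", "central-events", [("42", {"msg": "hello"})])
    print(body, end="")
```

Because the `index` action with an explicit `_id` overwrites an existing document, resending the same batch after a failure should not create duplicates, which seems compatible with the "eventually consistent within a minute" requirement. Whether this is better than a Beat or a built-in replication feature is exactly what I am asking.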