I currently have 3x ElasticSearch nodes in a single cluster in a
single datacenter.
They have 1x active index at any time (there's an alias that I move
around to point at it). It's minuscule by most standards: around
4 million documents, ~1GB on disk, updated daily.
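
To make the alias move concrete, here is a minimal sketch of the request body for the `_aliases` endpoint, which applies a remove and an add in a single step. The alias and index names are made up for illustration; only the shape of the body matters.

```python
import json


def alias_swap_actions(alias, old_index, new_index):
    """Build the body for POST /_aliases that repoints `alias`.

    Both actions are sent in one request, so searches against the
    alias never see a moment where it points at nothing.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical names: "docs" is the alias, the dated names are the
    # indexes built by yesterday's and today's cron runs.
    body = alias_swap_actions("docs", "docs-yesterday", "docs-today")
    print(json.dumps(body, indent=2))
```

The old index can be deleted once the alias has moved, or kept around for a day as a fallback.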
The index has 1 shard, and the replica configuration is such that
there's always a complete copy of the index on every machine. When I
make queries I use "preference: local" to ask ElasticSearch not to
forward the query to other machines.
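
One way to express that setup, as a sketch rather than the exact configuration in use here: the settings below assume `auto_expand_replicas` is what keeps a copy on every node (a fixed `number_of_replicas` of 2 on three nodes would have the same effect), and the search URL passes the preference as `_local`, which is how the REST API spells it. The host and index names are placeholders.

```python
from urllib.parse import urlencode


def full_copy_index_settings():
    """Index settings for one shard with a replica on every other node.

    Assumption: "0-all" lets the replica count follow the cluster size,
    so new nodes also receive a full copy.
    """
    return {
        "settings": {
            "index": {
                "number_of_shards": 1,
                "auto_expand_replicas": "0-all",
            }
        }
    }


def local_search_url(base_url, index, query):
    """Build a search URL that prefers shard copies on the receiving node."""
    params = urlencode({"q": query, "preference": "_local"})
    return f"{base_url}/{index}/_search?{params}"


if __name__ == "__main__":
    print(full_copy_index_settings())
    # Placeholder host and alias name.
    print(local_search_url("http://localhost:9200", "docs", "title:foo"))
```

Note that `_local` is a preference, not a guarantee, which is exactly why the forwarding question matters.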
I'm about to add 3x more machines in a second datacenter, and I'm
wondering whether to set them all up in one cluster or have two
clusters.
The reasons I'd set it up in one cluster:
- Easy to monitor the entire installation with one instance of
  ElasticSearch Head / other monitoring tools.
- Easier to maintain: I just have to run one cron script per day
  that talks to one cluster, although running it against N clusters
  would also be trivial.
- I don't really care about the slowness of the inter-DC link. It
  doesn't really matter if it takes 10 minutes or 1 hour to populate
  the index. While it's being populated I have yesterday's index
  around serving requests.
Reasons not to do it:
- Perhaps even with my setup of having a complete copy of the index
  on each machine and using "preference: local" there will be some
  situations (e.g. resource exhaustion on one node) that'll cause
  an ElasticSearch node in one datacenter to forward requests it
  can't handle to a node in another datacenter, at which point the
  inter-DC link would become a very painful bottleneck.

  But I have no idea whether that actually happens in practice, and I
  couldn't find anything documented about this.
- If I ever start using indexes that need to be more synchronous I'd
  have to change the configuration to use two clusters.
I'll probably just set it up as two clusters, because it's easy to run
two cronjobs and it leaves me the option of using more synchronous
indexes later.
But I thought I'd post this here in case it generates some interesting
discussion. In particular, I'd be interested in more details about when
ES nodes decide to forward requests to other nodes.
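
For the two-cluster option, the daily job could be a loop along these lines. This is only a sketch: the cluster names and URLs are hypothetical, and `populate` stands in for whatever the existing cron script does against a single cluster (build the new index, then move the alias).

```python
def run_daily_update(clusters, populate):
    """Run populate(url) against every cluster independently.

    Returns a dict mapping cluster name to None on success or to the
    error message on failure, so one unreachable datacenter does not
    stop the other from being updated.
    """
    results = {}
    for name, url in clusters.items():
        try:
            populate(url)
            results[name] = None
        except Exception as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    # Hypothetical endpoints, one per datacenter.
    clusters = {
        "dc1": "http://es.dc1.example:9200",
        "dc2": "http://es.dc2.example:9200",
    }

    def populate(url):
        # Placeholder for the real index build and alias move.
        print("would rebuild the index on", url)

    for name, error in run_daily_update(clusters, populate).items():
        print(name, "ok" if error is None else "failed: " + error)
```

Because each cluster is populated on its own, the slow inter-DC link is only used to ship the source data, never for queries.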