I'm not entirely positive, so wait for someone with more experience to
confirm/deny...but I don't think this is quite possible in ES right now.
You can fake it, however, with shard allocation filtering
(http://www.elasticsearch.org/guide/reference/modules/cluster.html),
multiple indices, and aliases.
First, let's talk about the solution that appears to work, but in fact
does not: forced awareness settings. Forced awareness basically prevents
duplication of data within the same zone, so a primary + replica cannot
live in the same zone.
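
For reference, here is a rough sketch of what the forced awareness settings look like, built as a Python dict so it can be dumped to JSON for the cluster update settings API (the same keys could go in elasticsearch.yml instead). The attribute name "zone" and the values "indexing" and "search" are assumptions taken from the example in this thread, and the helper function is purely illustrative. Each node would also need a matching zone attribute in its own config.

```python
import json


def forced_awareness_settings(attribute, zones):
    """Build cluster settings that enable forced awareness on one node attribute."""
    prefix = "cluster.routing.allocation.awareness"
    return {
        # Spread the copies of each shard across distinct values of this attribute.
        prefix + ".attributes": attribute,
        # Listing every expected value is what makes awareness "forced": copies
        # are not piled into one zone just because another zone is missing nodes.
        prefix + ".force." + attribute + ".values": ",".join(zones),
    }


# "zone", "indexing" and "search" are the hypothetical names used in this thread.
settings = forced_awareness_settings("zone", ["indexing", "search"])
body = json.dumps({"persistent": settings}, indent=2, sort_keys=True)
print(body)
```
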
Imagine you have two nodes in your "indexing" zone, and two nodes in your
"search" zone. Primary shards are allocated in "indexing", replicas on
"search". If you use forced awareness and a node in your "search" zone
goes down, ES will know to avoid initializing a corresponding replica in your
indexing zone, since the primary already lives there.
Even better, if you perform searches on the "search" zone, forced awareness
makes ES prefer querying nodes in the same zone. Great!
However, the problem arises if one of your indexing nodes goes down. Zones
enforce data duplication boundaries, but they do not interfere with primary
promotion. If one of your indexing nodes goes down, your cluster is now
missing a primary shard. ES has no choice but to promote a replica to a
primary, even if it lives in another zone. Now one of your primary shards is
actually living in the "search" zone and everything is all messed up.
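
To make the failure case concrete, here is a toy model in Python. It is not ES code: the node names, the shard layout and the fail_node helper are all made up for illustration, and it ignores the new replica ES would try to allocate afterwards. It only shows that losing an indexing node leaves a primary in the "search" zone.

```python
def fail_node(shards, dead_node):
    """Toy model: drop a node and promote the replica of any primary it held."""
    result = {}
    for shard_id, copies in shards.items():
        primary, replica = copies["primary"], copies["replica"]
        if primary == dead_node:
            # The primary is gone, so the surviving replica is promoted,
            # regardless of which zone it lives in.
            primary, replica = replica, None
        elif replica == dead_node:
            replica = None
        result[shard_id] = {"primary": primary, "replica": replica}
    return result


# Hypothetical four-node cluster: two nodes per zone, primaries on "indexing".
zone_of = {"idx1": "indexing", "idx2": "indexing",
           "srch1": "search", "srch2": "search"}
shards = {
    0: {"primary": "idx1", "replica": "srch1"},
    1: {"primary": "idx2", "replica": "srch2"},
}

after = fail_node(shards, "idx1")
primary_zones = {sid: zone_of[c["primary"]] for sid, c in after.items()}
print(primary_zones)  # {0: 'search', 1: 'indexing'}
```
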
As an alternative, what you can do is use shard allocation filtering to
separate an "indexing" index and a "search" index onto physically separate
nodes. E.g. the "search" index is forced to allocate to nodes with the
"search" tag. You then index into your "indexing" index (hah) and, when it is
ready for search requests, change the allocation tag on that index over to
"search".
ES will automatically transfer the shards over to your Search nodes. When
the transfer is complete, change a top-level alias to switch between the
old and new index transparently, then delete the old index. This method
obviously has a lot of moving parts, and loads the search nodes with
periodic network transfer as you move shards around.
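
Roughly, the hand-off described above comes down to three REST calls. The sketch below only builds them as (method, path, body) tuples and sends nothing. The index names "docs_v1" and "docs_v2", the alias "docs", and the node attribute name "tag" are assumptions for illustration, and the helper itself is hypothetical.

```python
import json


def handoff_requests(new_index, old_index, alias, tag="search"):
    """Return the (method, path, body) steps that hand an index to the search nodes."""
    return [
        # 1. Retag the freshly built index so its shards relocate to nodes
        #    whose "tag" attribute matches (index-level allocation filtering).
        ("PUT", "/%s/_settings" % new_index,
         {"index.routing.allocation.include.tag": tag}),
        # 2. After relocation has finished, swap the alias in a single request
        #    so searches move from the old index to the new one.
        ("POST", "/_aliases",
         {"actions": [
             {"remove": {"index": old_index, "alias": alias}},
             {"add": {"index": new_index, "alias": alias}},
         ]}),
        # 3. Delete the index that no longer serves searches.
        ("DELETE", "/%s" % old_index, None),
    ]


for method, path, body in handoff_requests("docs_v2", "docs_v1", "docs"):
    print(method, path, json.dumps(body) if body is not None else "")
```

Between steps 1 and 2 you would want to wait until the shards have actually moved, for example by polling cluster health until it reports no relocating shards.
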
-Zach
On Tuesday, February 19, 2013 7:37:26 AM UTC-5, Ophir Michaeli wrote:
Hi,
My question is about the best practice to divide elasticsearch indexing
and search systems.
Currently, without elasticsearch (and with lucene), our indexing works on
several machines at one location and the search machines are at another
location.
The indexed data is copied or updated periodically from the indexing
machines to the search machines.
We want to maintain a similar structure using elasticsearch.
Is it possible for the elasticsearch nodes on the indexing and search
machines to be in the same cluster,
so that the indexing nodes put replicas on the search nodes and keep
the replicas updated, while the search client contacts only the search
nodes (ignoring the indexing nodes, because they are in a different location
and contacting them would slow the search)?
Also, is it possible to set the indexing nodes in the cluster to update
the search nodes periodically (and not constantly, so that search performance
won't decrease)?
Thanks,
Ophir