I want to set my index template to 6 shards, with 2 replicas. Then I will
set max shards per node to 3, so the index is split evenly among the 6
nodes.
However, is there any way to say something like: maximum primary shards: 1?
What I see now is that some nodes end up with only replicas. I'd like
each node to have 1 primary and 2 replicas.
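For reference, the arithmetic behind the setup described above can be
sketched in a few lines of Python. This is only an illustration: the
setting names are the standard Elasticsearch index settings, but the
variable names are made up here, and how the settings are wrapped into an
index template is left out because it depends on the Elasticsearch version.

```python
import json

# Values taken from the question above.
NODES = 6
PRIMARY_SHARDS = 6
REPLICAS_PER_PRIMARY = 2

# Every primary has 2 replicas, so each shard exists in 3 copies.
total_copies = PRIMARY_SHARDS * (1 + REPLICAS_PER_PRIMARY)

# An even split is only possible if the copies divide evenly by the nodes.
copies_per_node, remainder = divmod(total_copies, NODES)

# Index-level settings that express this layout (hypothetical dict name).
index_settings = {
    "settings": {
        "index.number_of_shards": PRIMARY_SHARDS,
        "index.number_of_replicas": REPLICAS_PER_PRIMARY,
        "index.routing.allocation.total_shards_per_node": copies_per_node,
    }
}

print(json.dumps(index_settings, indent=2))
```

With these numbers there are 18 shard copies in total, which is exactly 3
per node. Note that the per-node limit counts primaries and replicas
together; it says nothing about how many of the 3 are primaries.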
Yet I wonder if this is really needed, since a primary is really just a
boolean value in the node metadata that causes an operation to be executed
on that shard first. The load should be the same on all nodes, no matter
whether they hold primaries or replicas.
simon
On Sunday, October 20, 2013 1:21:59 AM UTC+2, Bruce Lysik wrote:
Hi,
Say I have 6 Elasticsearch data nodes.
I want to set my index template to 6 shards, with 2 replicas. Then I will
set max shards per node to 3, so the index is split evenly among the 6
nodes.
However, is there any way to say something like: maximum primary shards: 1?
What I see now is that some nodes end up with only replicas. I'd like
each node to have 1 primary and 2 replicas.
So you're saying that in my setup for logstash, as long as every node has
the same number of today's shards, the load will be the same on each node?
If that's the case, then yes, I wouldn't bother tweaking any further.
So the responsibility of a primary is to execute the index operation first,
before it is sent to the replicas, but eventually every copy does the same
work at some point. So I wouldn't worry too much. That is also the reason
why we don't bother to distribute primaries in a balanced fashion, but
rather try to place an even number of shards globally and per index on the
nodes. The primary flag is really just a tie-breaker in the balance
algorithm.
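To make the argument above concrete, here is a toy Python model. It is a
sketch under simplifying assumptions, not how Elasticsearch works
internally: it only counts index operations per node, it ignores search
load and any coordination overhead, and the shard layout and the names
(copies, simulate, balanced, skewed) are invented for the example. Each
document is indexed on the primary first and then on every replica.

```python
from collections import Counter

NODES, SHARDS, REPLICAS = 6, 6, 2
DOCS_PER_SHARD = 1000

# Toy layout: shard s has its copies on three consecutive nodes,
# so every node holds exactly 3 shard copies.
copies = {s: [(s + i) % NODES for i in range(1 + REPLICAS)] for s in range(SHARDS)}


def simulate(primary_node):
    """Return (primaries per node, index operations per node)."""
    ops = Counter()
    primaries = Counter()
    for shard, nodes in copies.items():
        p = primary_node[shard]
        assert p in nodes, "the primary must be one of the shard's copies"
        primaries[p] += 1
        ops[p] += DOCS_PER_SHARD          # the primary executes the operation first
        for n in nodes:
            if n != p:
                ops[n] += DOCS_PER_SHARD  # then each replica does the same work
    return ([primaries[n] for n in range(NODES)],
            [ops[n] for n in range(NODES)])


# One primary per node, as requested in the question.
balanced = {s: copies[s][0] for s in range(SHARDS)}
# Same copies, but nodes 1, 3 and 5 hold replicas only.
skewed = {0: 0, 1: 2, 2: 2, 3: 4, 4: 4, 5: 0}

for name, layout in (("balanced", balanced), ("skewed", skewed)):
    prim, ops = simulate(layout)
    print(name, "primaries:", prim, "index ops:", ops)
```

In this model the number of index operations per node depends only on how
many shard copies the node holds, not on how many of them are primaries,
which is why an even number of shards per node is what matters.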