Elasticsearch AWS availability zone awareness

Hello,

I have a question regarding Elasticsearch AWS availability zone awareness,
which I will try to describe with an example.
Let's suppose that we have an Elasticsearch cluster running in eu-west-1
with 2 nodes (EC2 instances) per zone (3 availability zones, 6 nodes in
total).
As I understand it, if a zone fails and all copies of a shard were placed on
nodes in that same zone, the cluster will not be able to recover completely.
Is this the case, or is Elasticsearch aware of the AWS zones in a region, and
does it try to replicate shards to nodes in different zones?

Regards,
Nick

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Correct. You need to configure allocation awareness in order to make each
zone as independent as possible, so that, ideally, you have a whole copy of
the data in each zone.

Have a look at the reference documentation to see how to do it.
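
To make the failure scenario concrete, here is a minimal shell sketch. The `surviving_copies` function is a hypothetical helper, not part of Elasticsearch, and the zone names are simply the eu-west-1 zones from the example. It counts how many copies of a single shard are left when one zone goes away.

```shell
#!/bin/sh
# Hypothetical helper: count the copies of one shard that remain after a zone fails.
# Usage: surviving_copies FAILED_ZONE ZONE_OF_EACH_COPY...
surviving_copies() {
    failed=$1
    shift
    alive=0
    for copy_zone in "$@"; do
        if [ "$copy_zone" != "$failed" ]; then
            alive=$((alive + 1))
        fi
    done
    echo "$alive"
}

# Primary and replica both in eu-west-1a: losing that zone loses the shard.
surviving_copies eu-west-1a eu-west-1a eu-west-1a   # prints 0

# Copies spread over two zones: one copy is still available.
surviving_copies eu-west-1a eu-west-1a eu-west-1b   # prints 1
```

Without awareness nothing stops both copies from landing on the two nodes of one zone, which is the first case.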

Cheers
Luca

Thanks for the quick reply, Luca. That looks like what I was looking for.

Regards,
Nick

We do this as follows:

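Somewhere in the environment that starts Elasticsearch:

# Ask the EC2 instance metadata service which availability zone this node is in.
zone=$(/usr/bin/curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone 2>/dev/null)
export AWS_ZONE=$zone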

config/elasticsearch.yml:

node.awszone: ${AWS_ZONE}
cluster.routing.allocation.awareness.attributes: awszone
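
If the metadata lookup that feeds AWS_ZONE fails, the variable is presumably exported empty and the node starts without a usable zone attribute. A minimal sketch of a guard follows. `fetch_zone` and `require_zone` are hypothetical names, the 2-second timeout is an arbitrary choice, and the sketch assumes the metadata endpoint answers plain GET requests as in the post.

```shell
#!/bin/sh
# Hypothetical wrapper around the metadata call from the post.
fetch_zone() {
    /usr/bin/curl -s --max-time 2 \
        http://169.254.169.254/latest/meta-data/placement/availability-zone
}

# Print the zone, or fail if the lookup returned nothing or something that
# does not look like a zone name (only lowercase letters, digits and dashes).
require_zone() {
    zone=$(fetch_zone) || zone=""
    case "$zone" in
        ""|*[!a-z0-9-]*)
            echo "could not determine availability zone" >&2
            return 1
            ;;
    esac
    echo "$zone"
}
```

In a startup script this could be used as `AWS_ZONE=$(require_zone) || exit 1` followed by `export AWS_ZONE`, so that Elasticsearch is not started when the zone is unknown.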

Gotchas:

You have to set the minimum master nodes (discovery.zen.minimum_master_nodes)
to be 1 greater than the maximum number of nodes you have in a zone. That
prevents Elasticsearch from electing a new master unless it can see at least
two zones.

That's because it turns out that zones lose communication with each other
on a regular basis. Not for long, but long enough for Elasticsearch to
detect it and split brain, especially if you have it set to the default of
1. We have 4 zones, 6 nodes: 2 West zones of 2 nodes each, 2 East zones of
1 node each. East would lose contact with West and split brain.
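
The arithmetic behind that rule can be sketched in shell. Both functions are hypothetical helpers that take the number of master-eligible nodes in each zone as arguments. The first applies the rule from this post, and the second computes a strict majority of all nodes for comparison.

```shell
#!/bin/sh
# Rule from the post: largest zone + 1.
# Usage: zone_rule_quorum NODES_IN_ZONE...
zone_rule_quorum() {
    max=0
    for n in "$@"; do
        if [ "$n" -gt "$max" ]; then
            max=$n
        fi
    done
    echo $((max + 1))
}

# Strict majority of all nodes: floor(total / 2) + 1.
majority_quorum() {
    total=0
    for n in "$@"; do
        total=$((total + n))
    done
    echo $((total / 2 + 1))
}

# The layout described above: 2 West zones of 2 nodes, 2 East zones of 1 node.
zone_rule_quorum 2 2 1 1   # prints 3
majority_quorum 2 2 1 1    # prints 4
```

With a value of 3, the East side (2 nodes) can no longer elect its own master when it loses contact with West. A partition that leaves 3 nodes on each side could still let both halves elect one, which is why a strict majority (4 here) is the value usually recommended.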

You need at least 0.90.5, because it includes a fix for a bug where shards
wouldn't balance if you didn't have an equal number of nodes in each zone.

There may be an additional problem relating to primary allocation, as I now
have all the primary shards on one node.
