I'm running a large multi-node cluster with both shard allocation awareness and shard allocation filtering enabled. Looking at the shard layout in the elasticsearch-head plugin, I can see that only the filtering is taking effect; the allocation awareness is not. Some shards have both their primary and their replica in the same AWS availability zone.
The cluster nodes report the expected attributes:
    # Cluster node info
    {
      "name": <my data node>,
      "attributes": {
        "aws_availability_zone": "us-east-1a",  # What I'm trying to use for shard awareness
        "tier": "1",                            # my shard filter parameter is tier
        "master": "false"
      }
    }
    # elasticsearch.yml
    cloud.aws.access_key: XXXXXXXXX
    cloud.aws.secret_key: XXXXXXXXX
    cloud.node.auto_attributes: true
    cluster.routing.awareness.attributes: aws_availability_zone
Version info: Elasticsearch 1.4.3, elasticsearch-cloud-aws plugin 2.4.1; all nodes are in the us-east-1 region.
Another data point:
I have a second, smaller cluster running Elasticsearch 1.7.1 with plugin version 2.7.0, and I initially ran into a similar issue there. I created 3 nodes in 3 AZs (same cloud settings as above) and an index with 5 shards and 1 replica per shard (primary + replica), and then added 3 new nodes in the original 3 AZs. This cluster was also "off balance", with some primaries and their replicas stored in the same AZ. A new index with 6 shards spread across all 6 nodes appears to be properly balanced between availability zones. This cluster does not have shard filtering set on the indices.
I would have thought that perhaps the additional nodes threw off the balancing, but all of the indices on my 1.4 cluster were created after all of the nodes existed.
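For reference, this is roughly how the co-located copies could be listed without eyeballing elasticsearch-head. It is only a minimal sketch in Python: the function name `colocated_shards`, the row layout, and the sample node names are hypothetical. The shard rows are assumed to have been parsed from `_cat/shards` output, and the node-to-zone map is assumed to come from the `aws_availability_zone` attribute shown in the node info above.

```python
from collections import defaultdict


def colocated_shards(shards, node_zone):
    """Return {(index, shard): zone} for shard groups whose copies all sit in one zone.

    shards:    iterable of dicts with keys "index", "shard", "prirep", "node"
               (hypothetical layout, e.g. parsed from _cat/shards output)
    node_zone: dict mapping node name -> availability zone attribute
    """
    zones = defaultdict(list)
    for row in shards:
        if row["node"] is None:  # skip unassigned copies
            continue
        zones[(row["index"], row["shard"])].append(node_zone[row["node"]])
    # A group violates awareness when it has 2+ copies and they share a single zone.
    return {
        key: found[0]
        for key, found in zones.items()
        if len(found) > 1 and len(set(found)) == 1
    }


# Made-up sample data, for illustration only.
sample_node_zone = {
    "data-1": "us-east-1a",
    "data-2": "us-east-1a",
    "data-3": "us-east-1b",
}
sample_shards = [
    {"index": "logs", "shard": 0, "prirep": "p", "node": "data-1"},
    {"index": "logs", "shard": 0, "prirep": "r", "node": "data-2"},
    {"index": "logs", "shard": 1, "prirep": "p", "node": "data-1"},
    {"index": "logs", "shard": 1, "prirep": "r", "node": "data-3"},
    {"index": "logs", "shard": 2, "prirep": "p", "node": "data-3"},
    {"index": "logs", "shard": 2, "prirep": "r", "node": None},
]

if __name__ == "__main__":
    for (index, shard), zone in sorted(colocated_shards(sample_shards, sample_node_zone).items()):
        print(f"{index}[{shard}]: all copies in {zone}")
```

With the sample data, only shard 0 of `logs` is flagged, since both of its copies sit on nodes in us-east-1a; shard 2 is ignored because its replica is unassigned.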
How do I properly balance shards across AWS availability zones?