Could you tell me a couple more things so I can try to reproduce this:
- how many nodes and their zone related configs you use
6 nodes across 4 zones. Replica count of 3, so one copy of each shard ends up in every zone somewhere.
2 nodes: west-a zone
2 nodes: west-b zone
1 node: east-a zone
1 node: east-b zone
From the original screenshot you can see it's all green.
East-a has 1 node, 20 shards
East-b has 1 node, 20 shards
West-a has 1 node with 19 shards and 1 node with 1 shard
West-b has 1 node with 19 shards and 1 node with 1 shard
Stopping a node in the west causes the remaining node to get 20 shards. If I start it up again, only a single shard migrates back.
Self-inflicted part: in bootstrapping this cluster I brought up west first. Then I added east, which started moving shards around. I realized I needed more replicas for full east/west redundancy and increased the count to 3. The shards in the east got stuck migrating and stayed stuck for days, so I shut down the eastern nodes along with a stuck western node or two, one of which held the primary. That brought the cluster back to a fully green state, 20 shards everywhere, with some unallocated. I started east up again and it copied shards to the east quickly, possibly taking the replicas away from western nodes, but without rebalancing.
Other than the replica count and allocation awareness set to the zone attribute listed below, our settings are pretty much the defaults.
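For reference, the setup described above boils down to something like the following in elasticsearch.yml (a sketch of what's described in this thread, not our literal config; the attribute value of course differs per node):

```yaml
# Tag each node with its zone; this value varies per node
# (us-west-2a, us-west-2b, us-east-2a, us-east-2b):
node.awszone: us-west-2a

# Tell the allocator to spread copies of each shard across
# distinct values of the awszone attribute:
cluster.routing.allocation.awareness.attributes: awszone
```

The replica count itself lives on the index (index.number_of_replicas: 3), set via the index settings API.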
- the index settings you used to create this index?
I want to write a test that 'mocks' your situation, so if you have any custom settings please let me know. I'd also be curious whether all your shards are in state "active" or if something is relocating.
Is your cluster green?
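A quick way to answer the relocating/green question is the cluster health endpoint. The snippet below is a sketch: on the live cluster you'd run the curl line shown in the comment; here it filters a saved sample response (the JSON is illustrative, not real output from this cluster) for the counters that matter, where anything non-zero means shards are still moving or missing.

```shell
# On the live cluster you would run something like:
#   curl -s 'localhost:9200/_cluster/health?level=shards'
# Below we filter a saved sample response for the shard counters.
cat <<'EOF' > /tmp/health.json
{"cluster_name":"example","status":"green","relocating_shards":0,"initializing_shards":0,"unassigned_shards":0}
EOF

# Pull out just the three "are we settled yet?" counters:
grep -oE '"(relocating|initializing|unassigned)_shards":[0-9]+' /tmp/health.json
```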
simon
On Friday, August 23, 2013 9:42:50 PM UTC+2, Pierce Wetter wrote:
We're using 0.90.3 with a cluster we pretty much stood up just for this.
Here are the cluster settings: (though this duplicates the config file).
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "awszone"
  },
  "transient": {}
}
Sample Node:
"nCG5WzNRS6mRFF0ULbLYtA": {
"name": "Nelson, Foggy",
"transport_address": "inet[/10.98.14.42:9300]",
"hostname": "reader-elastic03.prod2.cloud.cheggnet.com",
"version": "0.90.3",
"http_address": "inet[/10.98.14.42:9200]",
"attributes": {
"awszone": "us-west-2a"
}
},
On Thursday, August 22, 2013 11:06:50 PM UTC-7, simonw wrote:
Can you provide more information about your setup, i.e. all the node settings etc., the Elasticsearch version you are running, and any allocation decider settings (curl -XGET localhost:9200/_cluster/settings)?
simon
On Friday, August 23, 2013 1:50:12 AM UTC+2, Pierce Wetter wrote:
Background: I've set up the number of replicas to 3, and I have allocation awareness set to match the awszone property, which I'm setting in the config.
There's 6 nodes total:
2 in us-west-2a
2 in us-west-2b
1 in us-east-2a
1 in us-east-2b
So in theory, I should be able to lose an entire side of the country. I would also expect the two nodes in us-west to balance the shards between them, not just allocate a single shard to the second machine.
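(One related knob worth knowing about, sketched here under the zone names above: plain awareness spreads copies across whatever zones are currently present, so after losing a whole coast the surviving zones can absorb all the copies. Forced awareness instead tells the allocator to leave replicas unassigned until the missing zones return:)

```yaml
cluster.routing.allocation.awareness.attributes: awszone
# With forced awareness, replicas reserved for absent zones stay
# unassigned rather than piling onto the surviving zones:
cluster.routing.allocation.awareness.force.awszone.values: us-west-2a,us-west-2b,us-east-2a,us-east-2b
```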
Any ideas?
Pierce
--
You received this message because you are subscribed to a topic in the Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/elasticsearch/9yZw7sryFb4/unsubscribe.
To unsubscribe from this group and all its topics, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.