I keep running into stability issues on the cluster whenever I rebuild
one of the nodes. The issues occur while the shards are rebalancing.
In more detail: I have a cluster with 3 nodes. The data is time-based, so
one index is created per month, with 1 shard and 1 replica per index. This
means each node holds about 2/3 of the indexes.
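
To make the 2/3 figure concrete, here is a minimal sketch. It assumes the shard copies are spread evenly and that the two copies of an index never share a node. The node names, the index names, and the `place_copies` helper are made up for illustration; this is not how the cluster actually allocates shards.

```python
from itertools import cycle


def place_copies(index_names, nodes, copies_per_index=2):
    """Spread the copies of each index over the nodes round-robin.

    Hypothetical helper: copies_per_index=2 stands for 1 primary shard
    plus 1 replica. Consecutive picks from the cycle are distinct nodes
    as long as copies_per_index <= len(nodes).
    """
    placement = {node: set() for node in nodes}
    node_cycle = cycle(nodes)
    for name in index_names:
        for _ in range(copies_per_index):
            placement[next(node_cycle)].add(name)
    return placement


# One index per month for a year (illustrative names only).
indexes = [f"data-2013-{month:02d}" for month in range(1, 13)]
nodes = ["node-1", "node-2", "node-3"]

placement = place_copies(indexes, nodes)
for node, held in placement.items():
    print(node, "holds", len(held), "of", len(indexes), "indexes")
```

With 2 copies of every index spread over 3 nodes, each node ends up with 2/3 of the indexes, which is also roughly the amount of data that has to be copied to a rebuilt node.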
This setup works great in general. Problems start when one of the nodes
goes down and has to be rebuilt.
For instance, yesterday Amazon had issues in one AZ, which brought one of
the nodes down for several hours, and I had to rebuild a new node from
backup. So the new node was added to the cluster, and the cluster began
rebalancing the shards.
While the rebalancing was in progress, cluster performance degraded
severely. Indexing times went up (indexing occasionally even came to a
halt) and search times went up as well, which badly affected our
application. At some point another node
stopped responding and was removed from the cluster, and I needed to
restart that node also, which meant more rebalancing. So bringing the
cluster back to a smoothly running state takes a couple of long hours,
during which performance ranges from bad to terrible.
Is there any way to handle this issue? Am I missing something in the
cluster configuration that can prevent these problems?