Indexing slows while rebalancing?

(Ecweaver) #1

We have a large-ish cluster used for log search, document size usually a few hundred bytes max.

We have had to re-spin nodes, causing recoveries and rebalances.

During rebalance, and sometimes at points in the recovery phase, indexing slows to the point where log lines get dropped (since the senders are configured to drop rather than queue).

Is this indexing slowdown expected? Is there any remedy for it other than setting up Kafka (or Logstash 2.0)?

We are using AWS, 20 i2.8xlarge instances split across two availability zones. We have rebalancing throttled to 2 streams and recovery to 8. Would appreciate any insight into what's going on here.

(Ecweaver) #2

Turns out rebalancing (as such) was not the issue. The issue was that a new day's index had all its shards on one or two nodes, and because of the imbalance in the data loading, the primary shards did not get distributed out in the normal way. As a result, one or two nodes were taking all the indexing.
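
For anyone wanting to check for the same condition, here is a minimal sketch in Python. It assumes the shard listing has already been fetched as JSON (for example from `_cat/shards?format=json`, whose rows carry `index`, `shard`, `prirep` and `node` fields). The function names, the `tolerance` parameter and the idea of comparing against an even spread are illustrative assumptions, not something from the cluster described above.

```python
from collections import Counter


def primaries_per_node(shard_rows, index_name):
    """Count the primary shards of one index that sit on each node.

    shard_rows: list of dicts with "index", "prirep" and "node" keys
    (the shape is assumed to match `_cat/shards?format=json`).
    """
    counts = Counter()
    for row in shard_rows:
        node = row.get("node")
        # "p" marks a primary; unassigned shards have no node and are skipped.
        if row["index"] == index_name and row["prirep"] == "p" and node:
            counts[node] += 1
    return dict(counts)


def overloaded_nodes(counts, all_nodes, tolerance=1):
    """Return nodes holding more primaries than an even spread plus tolerance."""
    total = sum(counts.values())
    fair_share = -(-total // len(all_nodes))  # ceiling division
    return sorted(n for n, c in counts.items() if c > fair_share + tolerance)
```

If `overloaded_nodes` returns anything for the current day's index, those nodes are likely absorbing most of the indexing load.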

I hand-rerouted those primary shards and indexing speed went back to normal.

It's a thing to watch out for...
