It looks like Elasticsearch is aware of the imbalance and is moving shards to address it. This rebalancing process takes time.
$ grep es-shards.txt -e RELOCATING
daas-arm-prod-users-2019-10 4 r RELOCATING 13259050 59.4gb 10.187.72.6 arm-lc-004_data -> 10.187.72.4 Be-sYmy7TJquJWCAmZ2aSA arm-lc-002_data
daas-arm-prod-users-2019-09 3 p RELOCATING 16503328 69.7gb 10.187.72.6 arm-lc-004_data -> 10.187.72.4 Be-sYmy7TJquJWCAmZ2aSA arm-lc-002_data
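If you want to keep an eye on which shards are moving where, a rough Python sketch like the one below can summarise the RELOCATING lines. It assumes the file is whitespace-separated `_cat/shards` output in the column order shown above and that node names contain no spaces; `parse_relocations` is a name made up for this illustration.

```python
def parse_relocations(lines):
    """Return one dict per RELOCATING line of _cat/shards-style output."""
    moves = []
    for line in lines:
        fields = line.split()
        # Expected layout (an assumption based on the output above):
        # index shard prirep state docs store ip node -> ip id node
        if len(fields) < 8 or fields[3] != "RELOCATING" or "->" not in fields:
            continue
        arrow = fields.index("->")
        moves.append({
            "index": fields[0],
            "shard": int(fields[1]),
            "kind": "primary" if fields[2] == "p" else "replica",
            "size": fields[5],
            "source": fields[arrow - 1],  # node name just before the arrow
            "target": fields[-1],         # last column is the target node name
        })
    return moves
```

Calling `parse_relocations(open("es-shards.txt"))` gives one entry per shard that is currently on the move, so you can re-run it to see whether the list is shrinking.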
Some of your indices have far too many shards that are far too small. E.g. daas-arm-int-dataflow-2019-10 has 10 shards all smaller than 70MB. Indices like this should have one shard.
You have some excessively tiny daily indices too, e.g. kafka-metrics-* and jmx-*. These would be better as one-shard monthly indices.
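To find the other indices with too many tiny shards, you could run something like this sketch over the same shard listing. It assumes whitespace-separated `_cat/shards` output with the store size in the sixth column; the 1GB cutoff and the function names are arbitrary choices for illustration, not a recommended threshold.

```python
import re
from collections import defaultdict

# Size suffixes as they appear in _cat output, e.g. "59.4gb" or "70mb".
UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3,
         "tb": 1024**4, "pb": 1024**5}

def parse_size(text):
    """Convert a human-readable size such as '59.4gb' to bytes."""
    match = re.fullmatch(r"([\d.]+)([kmgtp]?b)", text.lower())
    if not match:
        raise ValueError(f"unrecognised size: {text!r}")
    return float(match.group(1)) * UNITS[match.group(2)]

def oversharded(lines, max_bytes=1024**3):
    """Map index name -> primary shard count for indices that have more
    than one primary shard, all of them smaller than max_bytes."""
    sizes = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Skip replicas and lines with no size (e.g. unassigned shards).
        if len(fields) < 6 or fields[2] != "p":
            continue
        try:
            sizes[fields[0]].append(parse_size(fields[5]))
        except ValueError:
            continue
    return {index: len(s) for index, s in sizes.items()
            if len(s) > 1 and max(s) < max_bytes}
```

For example, `oversharded(open("es-shards.txt"))` returns the indices that are candidates for shrinking to a single shard, along with how many primaries each one has today.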
Other indices look to be time-based but are surprisingly large. E.g. daas-arm-prod-users-2019-10 has 20 shards all in the region of 60GB. 60GB shards are fine, but why put 20 of them into one index? Would you be able to have fewer shards at once and use rollover to start a new index when the old one gets too big?
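As a rough illustration of the rollover idea, the sketch below builds the request for the rollover API with a `max_size` condition, which (as far as I know) is measured against the total size of the index's primary shards. The alias name, the shard count of 4 and the 50GB-per-shard target are all made-up values for the example, and `rollover_request` is a hypothetical helper, so pick numbers that suit your own data.

```python
import json

def rollover_request(write_alias, primary_shards, target_shard_gb):
    """Build (method, path, body) for a rollover call that triggers once
    the primaries reach roughly target_shard_gb each."""
    # max_size applies to the sum of all primary shards in the index.
    max_size_gb = primary_shards * target_shard_gb
    body = {"conditions": {"max_size": f"{max_size_gb}gb"}}
    return "POST", f"/{write_alias}/_rollover", body

if __name__ == "__main__":
    # Hypothetical alias and sizing, for illustration only.
    method, path, body = rollover_request("daas-arm-prod-users-write", 4, 50)
    print(method, path)
    print(json.dumps(body, indent=2))
```

You would send this periodically against a write alias that points at the current index. The index only rolls over when the condition is met, so you get a new index when the old one is full rather than on a fixed calendar boundary.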