Recently I observed a single-node cluster (ES 6.2.4) with a very large number of shards.
I know it's bad practice, and I advised the relevant party to change their Elastic ways,
but out of curiosity I'd appreciate it if anyone could explain what I saw.
Node specs: 4 cores, 32GB RAM, 16GB of it for the ES heap.
Number of shards in the cluster: 14,000 in 600 indices.
Total number of docs: 4.5 billion
Total size: 700GB
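
To put those numbers in perspective, here is a rough back-of-the-envelope calculation of the per-shard averages. It is only a sketch: it assumes shards are evenly sized (I have not verified that), and it takes 1GB as 1024MB. The variable names are mine, purely for illustration.

```python
# Figures quoted in the post above.
shards = 14_000
indices = 600
docs = 4_500_000_000
total_gb = 700
heap_gb = 16

# Averages, assuming shards are evenly sized (an assumption, not measured).
shards_per_index = shards / indices
mb_per_shard = total_gb * 1024 / shards   # assumes 1 GB = 1024 MB
docs_per_shard = docs / shards
shards_per_heap_gb = shards / heap_gb

print(f"shards per index:     {shards_per_index:.1f}")
print(f"avg shard size:       {mb_per_shard:.1f} MB")
print(f"avg docs per shard:   {docs_per_shard:,.0f}")
print(f"shards per GB heap:   {shards_per_heap_gb:.0f}")
```

So on average each index has about 23 shards, and each shard holds only around 50MB and a few hundred thousand docs, with roughly 875 shards per GB of heap.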
When restarting the node, I observed that the rate at which ES loads shards steadily decreases while the node's CPU usage steadily increases, starting at ~40% utilization and quickly climbing to an average of 90% or more.
What's more, even after all shards had finished loading and the cluster turned green, CPU usage did not decrease significantly.
I don't think heap memory was the issue, because I observed a standard sawtooth pattern. There were not many old gen collections, and after an old gen collection heap usage dropped to about 8GB (out of 16GB). Young gen collections were far more common (roughly 5 per second).
It's as if ES is still doing something costly even after all the shards have been loaded.
Thanks