Constant Garbage Collecting

That's what it's reporting, but it's not the case. I have 2 hot data nodes and 1 cold data node on this cluster, but because of this issue they keep "leaving" the cluster, so the output you're seeing isn't correct. As for the heap, I have been following the recommendation to give it half the physical memory. On the three data nodes I upped the RAM to 48GB and set the min/max heap to 24GB in jvm.options.
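For reference, a minimal sketch of what that heap setting looks like. This assumes a standard install where the file is named jvm.options and sits in the Elasticsearch config directory; the exact path depends on how it was installed.

```
# config/jvm.options (newer versions also read files under config/jvm.options.d/)
# Min and max heap set to the same value, half of the 48GB of RAM on each data node
-Xms24g
-Xmx24g
```

Setting both to the same value keeps the JVM from resizing the heap at runtime, and 24GB stays below the roughly 32GB point where compressed object pointers stop being used.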

-- I forgot to answer your question about the shards. I've been trying to do that, but my other issue is that I can't use the reindex API. I have been force merging, though, to at least reduce the number of segments. That works, at least.
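In case it helps anyone following along, here is a rough sketch of the kind of force merge call I mean, written in Python. The host (`http://localhost:9200`), the index name (`logs-2019.01.01`) and the helper name `build_forcemerge_request` are made up for illustration, and `max_num_segments=1` is just an example value, not necessarily what I ran. The sketch only builds the request; it does not send it.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_forcemerge_request(base_url, index, max_num_segments=1):
    """Build (but do not send) a POST request for the _forcemerge API.

    base_url and index are placeholders; adjust them for your cluster.
    """
    query = urlencode({"max_num_segments": max_num_segments})
    # Keep '*' and ',' unescaped so wildcard and comma-separated index names still work.
    path = quote(index, safe="*,-_.")
    url = f"{base_url.rstrip('/')}/{path}/_forcemerge?{query}"
    return Request(url, method="POST")


# Hypothetical host and index name, for illustration only.
req = build_forcemerge_request("http://localhost:9200", "logs-2019.01.01")
print(req.get_method(), req.full_url)
```

To actually run it you would pass `req` to `urllib.request.urlopen`. Force merging is best kept to indices that are no longer being written to, since it is an expensive operation.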