So I have done several things to expand the cluster: I increased sharding, added warm nodes, and increased the CPU capacity of the hot nodes by doubling the instance size.
What I keep seeing is a single data node at around 90% CPU usage. There are 6 hot data nodes, but this specific one holds almost twice as much data as any of the others, and I don't know why. I haven't touched any of the default rebalancing settings, and most of my smaller indices have 1 primary + 1 replica shard. It's the large cluster sending logs to it that takes up all of the space. So it really comes down to a single multi-shard index somehow taking up the space and not getting balanced properly.
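To check whether the big index really is piled up on one node, a rough sketch like the one below could help. It assumes the text output of `GET _cat/shards?h=index,shard,prirep,store,node&bytes=b` has been saved into a string. The index name `logs-big`, the node names, and the sizes are made-up sample data, not values from my cluster.

```python
from collections import defaultdict

# Hypothetical sample of `_cat/shards?h=index,shard,prirep,store,node&bytes=b`.
# Index names, node names, and byte counts are invented for illustration.
SAMPLE = """
logs-big  0 p 500 hot-1
logs-big  0 r 500 hot-2
logs-big  1 p 520 hot-1
logs-big  1 r 520 hot-3
logs-big  2 p 480 hot-1
logs-big  2 r 480 hot-2
small-idx 0 p 10  hot-3
small-idx 0 r 10  hot-2
logs-big  3 r
"""

def summarize_shards(cat_output, index_name):
    """Return {node: (shard_count, total_store_bytes)} for one index."""
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for line in cat_output.strip().splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # unassigned shards have no store size or node
        index, store, node = parts[0], parts[3], " ".join(parts[4:])
        if index != index_name:
            continue
        counts[node] += 1
        sizes[node] += int(store)
    return {node: (counts[node], sizes[node]) for node in counts}

if __name__ == "__main__":
    summary = summarize_shards(SAMPLE, "logs-big")
    # Largest node first, so a skewed node stands out at the top.
    for node, (count, size) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
        print(f"{node}: {count} shards, {size} bytes")
```

If one node shows clearly more shards or bytes for the large index than the rest, that would support the idea that the imbalance comes from that one index.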
Of course, I am not sure whether one has anything to do with the other, meaning the high CPU may not be caused by the larger amount of data on that node, but I can't find anything in the hot threads other than write and Lucene activity.
Along with that, one specific node has a much higher document count, CPU load, IOPS, and segment count than the others. Maybe this node mostly holds shards of the large index, and the cluster is balancing on the total number of shards per node rather than on how the shards of the 1 huge index are spread.
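If that guess is right, one option I am aware of is the per-index setting `index.routing.allocation.total_shards_per_node`, which caps how many shards of a single index may sit on one node. The sketch below only works out a candidate cap and builds the JSON body for `PUT /<index>/_settings`. The shard and node counts in the example are assumptions, and the helper names are hypothetical. Note that this setting is a hard limit, so a cap that is too tight can leave shards unassigned when a node drops out.

```python
import json
import math

def shards_per_node_cap(primaries, replicas, data_nodes):
    """Smallest per-node cap that still lets every shard copy be allocated."""
    total_copies = primaries * (1 + replicas)
    return math.ceil(total_copies / data_nodes)

def settings_body(cap):
    """JSON body for PUT /<index>/_settings (index name not included here)."""
    return json.dumps({"index.routing.allocation.total_shards_per_node": cap})

if __name__ == "__main__":
    # Assumed example: 6 primaries, 1 replica each, 6 hot data nodes.
    cap = shards_per_node_cap(primaries=6, replicas=1, data_nodes=6)
    print(settings_body(cap))
```

Adding 1 to the computed cap would leave some headroom for a node failure, at the cost of allowing a little more skew.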