I agree with Aaron's recommendations: you have far too many shards and need to reduce the count dramatically. Please read this blog post on shards and sharding practices and then change how you create and manage indices. I would recommend switching to daily indices in order to increase the average shard size, as per the recommendations in the blog post. We are also running a webinar this Thursday on this topic, which may be useful.
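As a sketch of what that could look like (the `logs-*` pattern and shard counts here are assumptions, adjust them to your own naming and data volume), an index template that gives each new daily index a single primary shard might be:

```
PUT _template/logs_daily
{
  "index_patterns": ["logs-*"],
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}
```

With daily indices named e.g. `logs-2018.10.15`, each day's data then lands in one shard, which pushes the average shard size up compared to many small shards per day.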
Elasticsearch keeps a lot of data off-heap, which means the size of the file system cache is important for good performance. If you run multiple nodes on a host with just 64GB of RAM, you will need to adjust the amount of heap you give each node. As Elasticsearch by default assumes all nodes are equal, you may want to run two nodes, each with a 16GB heap, on all hosts. Another option would be to rearrange your disks and run a single node, hot or warm, per host with a 30.5GB heap.
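For the two-nodes-per-host option, the heap is set in each node's `jvm.options` file (a sketch; the exact sizes depend on what else runs on the host):

```
# jvm.options for each of the two nodes on a 64GB host.
# 2 x 16GB heap leaves roughly 32GB for the file system cache.
-Xms16g
-Xmx16g
```

Setting `-Xms` and `-Xmx` to the same value avoids heap resizing pauses; whichever layout you pick, aim to leave around half the host's RAM to the file system cache.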
Forcemerging down to a single segment can save a lot of heap and is definitely recommended. It is, however, as Aaron describes, quite an expensive and slow process that uses a lot of disk I/O, so only run it on indices that are no longer being written to.
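With daily indices this is a single API call per index once the day has rolled over (the index name here is just an example):

```
POST /logs-2018.10.14/_forcemerge?max_num_segments=1
```

Given the I/O cost, it is worth scheduling this during off-peak hours rather than running it against many indices at once.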