I have a five-node cluster that was recently upgraded from ES 1.7 to ES 2.3.0.
Each node has 13 GB of memory dedicated to ES. I have observed that running a large search query causes a node to drop out of the cluster. The error logs refer to zen ping being unable to reach the missing node. Restarting the missing node allows it to rejoin the cluster, but it then hangs while re-sharding.
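For anyone trying to reproduce this, here is a minimal sketch of the checks that seem relevant: cluster health (node count and unassigned shards) and per-node JVM heap usage, polled before and after the large query. It uses the standard `_cluster/health` and `_nodes/stats/jvm` REST endpoints. The base URL `http://localhost:9200` and the helper names (`build_url`, `heap_usage`, `fetch_json`) are assumptions for illustration, not something taken from my setup or logs.

```python
import json
from urllib.request import urlopen

# Assumption: a node answers HTTP on the default port; adjust as needed.
BASE_URL = "http://localhost:9200"

# Elasticsearch REST endpoints that return JSON.
ENDPOINTS = {
    "health": "/_cluster/health",
    "jvm": "/_nodes/stats/jvm",
}


def build_url(name, base_url=BASE_URL):
    """Return the full URL for one of the named endpoints."""
    return base_url.rstrip("/") + ENDPOINTS[name]


def heap_usage(node_stats):
    """Map node name -> heap_used_percent from a /_nodes/stats/jvm response."""
    return {
        node["name"]: node["jvm"]["mem"]["heap_used_percent"]
        for node in node_stats["nodes"].values()
    }


def fetch_json(name, base_url=BASE_URL, timeout=10):
    """GET the named endpoint and decode the JSON body."""
    with urlopen(build_url(name, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    health = fetch_json("health")
    # status is green/yellow/red; the other two are integer counts.
    print(health["status"], health["number_of_nodes"], health["unassigned_shards"])
    for name, pct in sorted(heap_usage(fetch_json("jvm")).items()):
        print(name, pct)
```

If heap usage on one node climbs toward 100% just before it disappears from `number_of_nodes`, that would point toward memory pressure from the query rather than a network problem, though that is only a guess on my part.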
Any help in troubleshooting and understanding this issue would be greatly appreciated.