I have an Elasticsearch cluster running 1.4, upgraded from 1.1 a week ago. The cluster is on a 10.100 subnet, and the web servers that use it are on a 172.31 subnet. Yesterday we started seeing very slow network response times, but only between the web servers and the ES nodes. It affects every web resource the nodes serve, such as plugin JS files, not just search results. The servers had heavy, sustained traffic over the weekend, hundreds of thousands of requests per hour.
Is there some sort of DDoS protection built into ES/Netty that could be causing this? It behaves as if it is being heavily throttled: the replies trickle out slowly. I have run iperf between the web servers and the ES nodes and get a steady throughput of nearly 1 Gbit/s.
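
Since iperf shows the raw link is fine, one way to narrow this down is to time a single HTTP request from a web server and see where the delay sits: in connecting, in waiting for the first byte, or in the body trickling in. Below is a minimal sketch of such a check. The function name `time_http_get` and the host `10.100.0.10` are placeholders of mine, and port 9200 assumes the default Elasticsearch HTTP port.

```python
import http.client
import time


def time_http_get(host, port=9200, path="/", timeout=30.0):
    """Time one HTTP GET and return connect, first-byte, and total durations in seconds."""
    start = time.perf_counter()
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.connect()                      # TCP handshake only
        connected = time.perf_counter()
        conn.request("GET", path)
        resp = conn.getresponse()           # returns once status line and headers arrive
        first_byte = time.perf_counter()
        body = resp.read()                  # blocks until the whole body is received
        done = time.perf_counter()
        status = resp.status
    finally:
        conn.close()
    return {
        "status": status,
        "bytes": len(body),
        "connect_s": connected - start,
        "ttfb_s": first_byte - start,
        "total_s": done - start,
    }


# Example call (placeholder address, replace with a real ES node):
# print(time_http_get("10.100.0.10", 9200, "/"))
```

If `connect_s` is small but `total_s` is much larger than `ttfb_s`, the body itself is arriving slowly. If `ttfb_s` dominates, the node is slow to start answering. Running the same check from a machine inside the 10.100 subnet would show whether the slowdown is specific to the path between the two subnets.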