Since upgrading from Elasticsearch 2.4 to 7.4.1 (I know, I know) we are seeing our AWS cross-AZ data transfer charges skyrocket. Both the old and new clusters use the cluster.routing.allocation.awareness.attributes: aws_availability_zone setting and are deployed across three AZs within a single AWS region.
What I'm wondering is, did we miss some "compress internal communication" setting or similar for shard shuffling?
Does ES7 move shards around more than 2.4 did? We have cluster.routing.allocation.node_concurrent_recoveries set to 2 on both, but didn't notice much difference when we raised it to 8.
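For reference, the relevant settings on our ES7 nodes look roughly like this in elasticsearch.yml. This is a sketch rather than our exact file, and the node.attr line and AZ value are illustrative (the discovery-ec2 plugin's cloud.node.auto_attributes can also set the attribute automatically):

```yaml
# elasticsearch.yml (sketch of the settings in play)
cluster.routing.allocation.awareness.attributes: aws_availability_zone
node.attr.aws_availability_zone: us-east-1a   # illustrative; set per node or via cloud.node.auto_attributes
cluster.routing.allocation.node_concurrent_recoveries: 2
transport.compress: true   # compress node-to-node transport traffic (off by default)
```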
So we set up a little lab:
- ES2 - 2 data nodes in different AZs
- ES7 - 2 data nodes in different AZs
- ES7 with transport.compress: true - 2 data nodes in different AZs
We ingested some data, then deleted it to get ready for our tests, and used Packetbeat to monitor the traffic on port 9300 between the data nodes.
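For anyone who wants to reproduce this, the Packetbeat side is just flow monitoring with a BPF filter so that only the transport port is counted. A rough sketch (the output host is a placeholder and should point somewhere outside the clusters under test):

```yaml
# packetbeat.yml (sketch) - measure node-to-node traffic on the transport port only
packetbeat.interfaces.device: any
packetbeat.interfaces.bpf_filter: "tcp port 9300"
packetbeat.flows:
  timeout: 30s
  period: 10s
output.elasticsearch:
  hosts: ["monitoring-host:9200"]   # placeholder
```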
- ES2, idle before we start re-ingestion: 156 MB transferred between nodes
- ES7, idle: 60 GB!!
- ES7, idle, with compression: 25 GB!!
Re-checking our numbers now, but that's crazy. What is it sending with no indices that's 60GB?
Going to check again with self-monitoring turned off on ES7, plus some of the other things that may differ between 2.x and 7.x.
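For the self-monitoring-off run, the only setting I plan to change is the collection switch. It's a dynamic cluster setting in 7.x, but it can also go in elasticsearch.yml (sketch):

```yaml
xpack.monitoring.collection.enabled: false
```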
So this is very telling ... starting from a blank slate, we ingested the same data set (Shakespeare's works) into both ES2 and ES7 (with compression on).
The initial ingestion looks similar, but then the ES7 data nodes keep transferring data between themselves continuously ... while ES2 is simply done.
Yes, this seems unexpected indeed. Can you disable compression and grab a full packet capture of the network traffic in this experiment using tcpdump? I'm very curious why we're transferring MBs of data so frequently.
There look to be several hundred indices in this cluster, so the index stats and recovery stats come to roughly 200 kB per request, and since something is requesting those stats periodically that accounts for more than half of the 4 MB capture you shared. There's also some write traffic to monitoring-beats-7-2020.04.08 and some read traffic from Kibana. I don't see anything particularly unexpected here; Elasticsearch is apparently doing quite a bit of work to serve client requests.
Based on your results there (is that tool available somewhere?), we raised xpack.monitoring.collection.interval from the default (10s) to 60s. I think you can see where that happened! That should help for sure. We may even go to 120s ...
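In case it helps anyone else, the change was just this one dynamic cluster setting (it can also live in elasticsearch.yml); the 60s value is ours, adjust to taste:

```yaml
xpack.monitoring.collection.interval: 60s
```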
You probably have it already. I just used tcpflow to split the pcap file up, then `cat * | strings | grep '\(cluster\|indices\):' | sort | uniq -c` to find all the things that looked like action names. Then I did a few spot checks with Wireshark (Statistics -> TCP flow graphs) to measure some approximate message sizes.
I'm not sure there's too much value in optimising monitoring traffic as you suggest: a few MB per minute is pennies per day in cross-AZ traffic costs, and normally completely swamped by actual production traffic. I don't think this is the reason for the 60GB of traffic that you mentioned above.
Last update: with transport.compress on, we are seeing the cost benefits we were hoping for, with only a slight increase in CPU. Still not sure what was happening in the lab tests, but in production we are at an acceptable data transfer rate.