Bulk indexing performance over a high-latency network

We have a 3-node ES cluster set up in AWS.
Running the default Rally benchmark from another instance in the same AWS VPC (using --benchmark-only) results in an indexing rate of ~45,000 docs/s. Running the same benchmark from a VM in our own data center over a VPN results in ~2,500 docs/s. We get even worse results with our own benchmark: 20,000 docs/s vs. 200 docs/s.
The VPN latency is 100ms - that's a lot, but it doesn't quite explain the huge performance difference. If we assume that each batch takes 1s to process, the latency should add only 0.2s to the end-to-end batch processing time.
We are using the default config, so http.keep_alive is true. The network throughput is ~12MB/s (we are nowhere near saturating it).
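
As a rough sanity check of that reasoning (just a sketch; it assumes a single synchronous bulk request in flight at a time and ~0.2s of extra round trip per batch):

```python
# Back-of-the-envelope check of the latency assumption
# (assumes one synchronous bulk request in flight at a time).
batch_time = 1.0   # assumed processing time per bulk batch, seconds
added_rtt = 0.2    # extra round trip over the 100ms VPN link, seconds

expected_slowdown = (batch_time + added_rtt) / batch_time
observed_slowdown = 45_000 / 2_500

print(f"expected slowdown: {expected_slowdown:.1f}x")  # 1.2x
print(f"observed slowdown: {observed_slowdown:.1f}x")  # 18.0x
```

So the latency alone should cost us around 20%, not an 18x slowdown.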

Any advice on how to solve/debug this?

@danielmitterdorfer (for Rally advice)

Hi @eugene_miretsky,

one (logical) bulk request is not just one network packet but is transferred with HTTP chunked transfer encoding, which means it is split across multiple network packets. So I suspect your assumption that the 100ms latency adds just 200ms per batch does not hold. I'd capture the network packets to see what's going on; you can use Wireshark for that.
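
In addition to the capture, it might help to time a single bulk request from both locations to see how much wall-clock time one batch really costs over the VPN. A minimal sketch with the Python client (host, index name and document shape are just placeholders; it requires elasticsearch-py, and older ES/client versions may also need a _type in each action):

```python
# Time one bulk request end-to-end; run the same script once from the
# VPC instance and once from the data-center VM and compare the rates.
import time
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(["http://10.0.0.1:9200"])  # placeholder host

# 5,000 tiny documents; non-underscore keys become the document source.
docs = ({"_index": "bulk-latency-test", "value": i} for i in range(5000))

start = time.perf_counter()
success, errors = helpers.bulk(es, docs)
elapsed = time.perf_counter() - start

print(f"indexed {success} docs in {elapsed:.2f}s "
      f"-> {success / elapsed:,.0f} docs/s")
```

Comparing the two numbers should make the per-request overhead obvious.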

Daniel

Thanks Daniel!

I'm not exactly a networking expert, but I gave Wireshark my best shot. Attached are screenshots of Wireshark captures over:

  1. a good connection (same VPC in AWS)
  2. a slow connection (100ms latency over the VPN)

As you can see, the packet size is 10x smaller, and an ACK is being sent for every packet (instead of one ACK for a batch of packets). The duplicate ACK rate is also 3%. Any idea what's causing this?

From what I understand, ES is using HTTP chunking, and it looks like the chunks are much smaller in the latter case. Is there a way to tune this?

Hi Eugene,

just as a heads-up: I'll try to look into this more closely, but it could take a bit of time until I can spare some cycles.

Daniel

@danielmitterdorfer Sure - any help would be appreciated.

Any idea how to enable HTTP chunking and compression in Rally?

Hi Eugene,

do you have the original packet dumps around? Would be great if you could share them with me.

Compression is not supported out of the box by the Python Elasticsearch client (i.e. it's not possible without writing custom code), but I have some code lying around that does this (I needed it for an HTTP compression benchmark in Elasticsearch).

I'll see what I can do to integrate that into Rally.
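
Just to illustrate the idea (this is not the exact code I have): you gzip the bulk body on the client side and send it with a Content-Encoding: gzip header. A minimal sketch using the requests library, assuming http.compression is enabled on the cluster and with placeholder host and index names:

```python
# Send a gzip-compressed _bulk request directly over HTTP.
import gzip
import json
import requests

def compressed_bulk(host, index, docs):
    # Build the newline-delimited bulk body: action line + source line per doc.
    # (Older ES versions also expect a "_type" in the action line.)
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    body = ("\n".join(lines) + "\n").encode("utf-8")

    resp = requests.post(
        f"{host}/_bulk",
        data=gzip.compress(body),
        headers={
            "Content-Encoding": "gzip",
            "Content-Type": "application/x-ndjson",
        },
    )
    resp.raise_for_status()
    return resp.json()

# Example (placeholder values):
# compressed_bulk("http://10.0.0.1:9200", "bulk-latency-test",
#                 [{"value": i} for i in range(5000)])
```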

Daniel

> Any idea how to enable HTTP chunking and compression in Rally?

If you use the latest master, you can specify arbitrary client options (see the docs). Otherwise, this will be supported starting with Rally 0.4.0.

For convenience: --client-options="compressed:true,timeout:90,request_timeout:90" (the latter two options are needed because specifying client options overrides the defaults, and I assume you don't want to change those values).

Edit: Expect that query latency will increase with compression (at least it does in my experience). Bulk indexing throughput should be roughly identical.

I tried to add support for disabling chunking, as hinted at by Honza in elasticsearch-py/#422, but it did not work well with the rest of this feature, so it stays unsupported for now. I am always happy to receive PRs, though. :wink:

Daniel