Elasticsearch 7.16.2 on M1 Max under Docker (ARM): aggressive throttling while indexing

Hey folks,

I've been struggling a bit with Elasticsearch 7.16.2 running under Docker on my MacBook Pro M1 Max (64 GB RAM, 8 TB SSD).

I have a job running under Ruby on Rails 6.1 with the most recent Elasticsearch Ruby client. It is running into index throttling that never happened on my 2019 Intel iMac (16 cores, 128 GB RAM) indexing to an external Thunderbolt 3 SSD. On that machine I was always able to run the indexing job to completion without any timeouts. Not so with the M1 Max.

In particular, on the ARM platform, Elasticsearch seems to be hitting throttling limits and eventually generating Faraday read timeouts after bulk indexing between 2 and 3 million of the 4.5 million documents. Each document consists of about seven keyword fields and a single geo_shape field containing a polygon that represents the boundary of a Census Bureau block.

I am currently working around this problem by pausing the indexing for about ten minutes after each batch of 250,000 documents. Not a very elegant solution, but it works. The Rake task is just using the Ruby Elasticsearch client to connect to the local Elasticsearch nodes and throwing the documents at the cluster with the bulk API, 1000 documents at a time.
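For anyone who wants to reproduce the workaround, here is a rough sketch of the batching loop described above. It is not my actual Rake task: the method name, the index name, and the keyword defaults are made up for illustration, and `client` is assumed to be anything that responds to `bulk(body:)` the way an `Elasticsearch::Client` does.

```ruby
# Sketch only: `client` is assumed to respond to #bulk(body:) like
# Elasticsearch::Client. The index name and pause values are illustrative.
def bulk_index_in_batches(client, docs, index:, batch_size: 1_000,
                          pause_every: 250_000, pause_seconds: 600)
  sent = 0
  docs.each_slice(batch_size) do |batch|
    # The bulk API expects one action line followed by one source line
    # for every document.
    body = batch.flat_map { |doc| [{ index: { _index: index } }, doc] }
    response = client.bulk(body: body)
    raise "bulk request reported item failures" if response["errors"]

    sent += batch.size
    # Crude back-off: pause after every `pause_every` documents so the
    # nodes can catch up before the next batch arrives.
    sleep(pause_seconds) if pause_seconds.positive? && (sent % pause_every).zero?
  end
  sent
end
```

With the real client this would be called as something like `bulk_index_in_batches(Elasticsearch::Client.new, docs, index: "census_blocks")`, where `docs` is any enumerable of hashes and the index name is hypothetical.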

Docker Desktop 4.3.2 (build 72729) is configured to use 8 CPUs, 4 GB swap, and 32 GB of RAM; each of the three Elasticsearch 7.16.2 nodes has a heap of 8 GB. Three local directories are bind-mounted as data volumes for the Elasticsearch nodes, so disk I/O shouldn't be a problem. Blackmagic Disk Speed Test reports 5,500 MB/sec writes even while my import job is running. I don't know what the IOPS rating is for this SSD, but I suspect it's at least as fast as the 1 TB Intel Thunderbolt 3 SSD I have on my iMac. I haven't tweaked the index write buffer size (indices.memory.index_buffer_size).

I do dedicate quite a lot more RAM to the nodes on the Intel machine, mostly because I have it. I suppose that could account for the indexing performance difference. But I would also like to know if anyone else is seeing any odd behavior of Elasticsearch on the M1 Macs just in case.

It'd be helpful if you could share the timeouts, your Elasticsearch config and logs, your docker container config, and hot threads from Elasticsearch.
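In case it helps with gathering that, here is a minimal sketch of pulling the throttling numbers out of an indices stats response with the Ruby client. The helper name `throttle_summary` and the index name in the comments are made up; the helper only assumes the usual shape of the `_stats/indexing` response (an `indices` key with `total.indexing` per index).

```ruby
# Summarise per-index indexing throttling from an indices stats response,
# i.e. the Hash returned for GET /<index>/_stats/indexing.
def throttle_summary(stats)
  stats.fetch("indices", {}).map do |name, data|
    indexing = data.dig("total", "indexing") || {}
    {
      index: name,
      throttled: indexing["is_throttled"] == true,
      throttle_seconds: indexing.fetch("throttle_time_in_millis", 0) / 1000.0
    }
  end
end

# With the official client (the index name "census_blocks" is hypothetical):
#   stats = client.indices.stats(index: "census_blocks", metric: "indexing")
#   throttle_summary(stats).each { |row| puts row }
#   puts client.nodes.hot_threads
```

Running the commented lines while the bulk job is in flight should show whether the throttle time keeps climbing, and the hot threads output shows what the nodes are busy with at that moment.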

Fortunately, this did turn out to be just a failure on my part to reserve enough RAM for indexing on these smaller nodes. I set indices.memory.index_buffer_size=2G on each node and restarted them; on the next run they completed the indexing of all 4.6 million documents without any pauses or timeouts.
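For context, the default for that setting is 10% of the heap, so an 8 GB heap gets roughly 0.8 GB of indexing buffer per node unless you override it. If you want to confirm that each node actually picked up the new value after the restart, something along these lines should work. It's a sketch: the helper name `index_buffer_sizes` is made up, and it assumes the node info response was requested with flat settings.

```ruby
# Map node name => explicitly configured index buffer size, given the Hash
# returned for GET /_nodes/settings?flat_settings=true. Nodes that have no
# explicit value (and so use the default) map to nil.
def index_buffer_sizes(nodes_info)
  nodes_info.fetch("nodes", {}).each_with_object({}) do |(_id, node), sizes|
    sizes[node["name"]] =
      node.dig("settings", "indices.memory.index_buffer_size")
  end
end

# With the official client:
#   info = client.nodes.info(metric: "settings", flat_settings: true)
#   index_buffer_sizes(info).each { |name, size| puts "#{name}: #{size.inspect}" }
```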
