Reduce Bulk rejections

Hi, I am load testing an Elasticsearch cluster with 150 data nodes (i3.2x AWS instances),
so the write thread pool size is 8 and the queue size is 200.

I am indexing into ES from 20 ECS tasks. Each task uses 6 threads to ingest through the bulk processor, with concurrent bulk requests set to 6.

However, I see bulk rejections from many of the data nodes. When I captured a thread dump from one of the nodes, I saw the stack trace below for one of the threads in the write pool.

java.lang.Thread.State: BLOCKED (on object monitor)
at org.elasticsearch.index.translog.TranslogWriter.add(
- waiting to lock <0x000000060393e258> (a org.elasticsearch.index.translog.TranslogWriter)
at org.elasticsearch.index.translog.Translog.add(
at org.elasticsearch.index.engine.InternalEngine.index(
at org.elasticsearch.index.shard.IndexShard.index(
at org.elasticsearch.index.shard.IndexShard.applyIndexOperation(
at org.elasticsearch.index.shard.IndexShard.applyIndexOperationOnPrimary(
at org.elasticsearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(
at org.elasticsearch.action.bulk.TransportShardBulkAction$2.doRun(
at org.elasticsearch.action.bulk.TransportShardBulkAction.performOnPrimary(
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(
at$AsyncPrimaryAction$$Lambda$2961/0x00000008017aec40.accept(Unknown Source)
at org.elasticsearch.action.ActionListener$1.onResponse(
at org.elasticsearch.index.shard.IndexShard.lambda$wrapPrimaryOperationPermitListener$24(
at org.elasticsearch.index.shard.IndexShard$$Lambda$2963/0x00000008017afc40.accept(Unknown Source)
at org.elasticsearch.action.ActionListener$3.onResponse(
at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(
at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(
at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationPermit(
at$$Lambda$2201/0x000000080143b040.messageReceived(Unknown Source)
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(
at org.elasticsearch.transport.TransportService$8.doRun(
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@14.0.1/
at java.util.concurrent.ThreadPoolExecutor$
Locked ownable synchronizers:
- <0x0000000083347370> (a java.util.concurrent.ThreadPoolExecutor$Worker)
- <0x0000000507f987d0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

It looks like one of the write threads holds the lock on the TranslogWriter object and the other threads are waiting to acquire the lock on the same object.

Is there anything that I can do to improve the throughput and reduce bulk rejections?
Does having dedicated ingest nodes help?

Below are the settings that I use:
150 data nodes
150 primary shards with 1 replica
index.translog.durability: async
index.translog.flush_threshold_size: 1024MB
Elasticsearch version 7.8
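
The per-node rejections mentioned above can be read from `GET _cat/thread_pool/write?v&h=node_name,active,queue,rejected`. Below is a minimal sketch of parsing that output to list the nodes that are rejecting. The sample text and node names are made up for illustration and were not captured from this cluster; the helper names are hypothetical.

```python
def parse_cat_thread_pool(text):
    """Parse whitespace-separated _cat/thread_pool output (with a ?v header row)."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        row = dict(zip(header, ln.split()))
        # Convert the numeric columns so they can be compared and sorted.
        for key in ("active", "queue", "rejected"):
            if key in row:
                row[key] = int(row[key])
        rows.append(row)
    return rows


def nodes_with_rejections(rows):
    """Return (node_name, rejected) pairs, highest rejection count first."""
    hits = [(r["node_name"], r["rejected"]) for r in rows if r["rejected"] > 0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample output; real node names and counts will differ.
SAMPLE = """
node_name active queue rejected
data-001       8   200     1532
data-002       3     0        0
data-003       8   187      412
"""

for name, rejected in nodes_with_rejections(parse_cat_thread_pool(SAMPLE)):
    print(name, rejected)
```

The `rejected` column is cumulative since the node started, so comparing two snapshots taken a few minutes apart gives a better picture of the current rejection rate than a single reading.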

Is your data immutable, or do you need to update it? How large are the documents on average? How large are the bulk requests you are using?

@Christian_Dahlqvist,
Data is not immutable. There are updates as well.
Documents are around 1 KB.
bulk actions = 5000
bulk size = 1GB
6 concurrent bulk requests
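
As a rough sanity check on these numbers, the load each node sees can be estimated with back-of-envelope arithmetic. This is only a sketch under stated assumptions: every ingest thread runs its own bulk processor, documents route evenly so each bulk touches every primary shard, each node holds one primary, and replica writes are ignored (so the real figure is likely higher).

```python
# Figures taken from the thread.
ECS_TASKS = 20
THREADS_PER_TASK = 6
CONCURRENT_BULKS = 6
DOCS_PER_BULK = 5000
DOC_SIZE_BYTES = 1024          # "around 1 KB"
PRIMARY_SHARDS = 150
DATA_NODES = 150
WRITE_THREADS = 8
WRITE_QUEUE = 200

# Assumption: each thread has its own bulk processor with 6 concurrent requests.
# If the 6 threads share one processor per task, drop THREADS_PER_TASK here.
in_flight_bulks = ECS_TASKS * THREADS_PER_TASK * CONCURRENT_BULKS

# Payload of one bulk when the 5000-action limit triggers the flush.
bulk_payload_mb = DOCS_PER_BULK * DOC_SIZE_BYTES / (1024 * 1024)

# Assumption: even routing, so one bulk fans out to every primary shard.
shard_requests = in_flight_bulks * PRIMARY_SHARDS

# Assumption: one primary per node; replica writes are not counted.
shard_requests_per_node = shard_requests / DATA_NODES

# Work a node can hold before it starts rejecting: active threads plus queue.
node_capacity = WRITE_THREADS + WRITE_QUEUE

print("in-flight bulk requests:", in_flight_bulks)
print("payload per bulk (MB):", round(bulk_payload_mb, 2))
print("shard-level requests per node:", shard_requests_per_node)
print("write pool capacity per node:", node_capacity)
```

Under these assumptions the 5000-action limit is reached at about 5 MB, far below the 1GB size limit, so each bulk carries only around 33 documents per shard while the number of shard-level requests per node can exceed what 8 threads plus a queue of 200 can hold.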

I think the first thing to try is to upgrade: newer versions have much more discerning backpressure mechanisms for bulk requests.


Why such a large cluster?
These days, with things like CCS (cross-cluster search), we recommend running multiple smaller clusters.

