Deadlock in Bulk Ingester using Elastic 8.15.3 in Java

Hello,

I'm experiencing a deadlock while using the bulk ingester, specifically during the flush operation. Several threads are blocked, each waiting with the following stack trace:

java.base/jdk.internal.misc.Unsafe.park(Native Method)
java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
java.base/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
java.base/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitUninterruptibly(AbstractQueuedSynchronizer.java:1660)
co.elastic.clients.elasticsearch._helpers.bulk.FnCondition.whenReadyIf(FnCondition.java:82)
co.elastic.clients.elasticsearch._helpers.bulk.BulkIngester.flush(BulkIngester.java:276)

Could you please assist in identifying what might be causing this issue and suggest any potential workarounds?

More info about the problem: maxConcurrentRequests is set to 1, and about 30 threads got stuck.

Thanks for your help!

Hello and welcome!

The BulkIngester had a known deadlock problem, which was mostly solved in 8.15.2. I say "mostly" because one user is still experiencing the issue even after the update. We couldn't reproduce it using the exact same code, so we concluded that it is probably something machine- or OS-related that we don't have control over.

You can read the original issue on our GitHub. The comments in particular should help you understand whether your problem could be solved by tweaking the BulkIngester configuration or changing the Listener implementation.

I just saw the edit with the additional information. maxConcurrentRequests is a parameter that indicates how many requests will be sent in parallel to the Elasticsearch server, and having just 1 request going out at a time with 30 threads adding operations will surely bottleneck the BulkIngester. Since even smaller deployments of the Elasticsearch server can easily handle multiple requests at once, I suggest increasing maxConcurrentRequests to 10 and trying again.
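
To make the bottleneck concrete, here is a minimal, self-contained Java sketch. It is only a simplified model built on a plain java.util.concurrent.Semaphore, not the BulkIngester's actual code. The class name ConcurrencyLimitModel, the method peakInFlight, and the 5 ms simulated request are all made up for illustration. It starts 30 producer threads and reports how many simulated requests were ever in flight at once for a given limit.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified model only: this is NOT how BulkIngester is implemented.
// It just shows how a concurrency limit makes producer threads wait.
final class ConcurrencyLimitModel {

    /**
     * Starts `producers` threads that each send one simulated request and
     * returns the highest number of requests that were in flight at once.
     */
    static int peakInFlight(int maxConcurrentRequests, int producers) {
        Semaphore permits = new Semaphore(maxConcurrentRequests);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(producers);
        ExecutorService pool = Executors.newFixedThreadPool(producers);
        try {
            for (int i = 0; i < producers; i++) {
                pool.execute(() -> {
                    try {
                        permits.acquire(); // producers block here once the limit is reached
                        try {
                            int now = inFlight.incrementAndGet();
                            peak.accumulateAndGet(now, Math::max);
                            Thread.sleep(5); // hypothetical stand-in for a request round trip
                        } finally {
                            inFlight.decrementAndGet();
                            permits.release();
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        done.countDown();
                    }
                });
            }
            done.await(); // wait until every producer has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for producers", e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("limit 1  -> peak in flight: " + peakInFlight(1, 30));
        System.out.println("limit 10 -> peak in flight: " + peakInFlight(10, 30));
    }
}
```

With a limit of 1, every producer except the one holding the permit is parked, which is the kind of queueing described above. Raising the limit lets more of the 30 threads proceed at the same time. This models ordinary waiting under a low limit, not the deadlock bug itself.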

Thanks! I upgraded the Elastic image from 8.12.1 to 8.15.3, and it resolved the issue.