Java BulkAPI Slowness

I'm seeing intermittent pauses/hangs that last a few seconds each when sending data to Elastic via the Java Bulk API. They happen frequently, sometimes every few seconds. I'm not sending nearly enough data that I would expect the Elastic nodes to hang the way they are: at most 480 documents at once, spread over 12 different Java threads. The documents are small too, only a few KB each. The end goal is to send tens of thousands of documents by looping over them in chunks.

Pseudo code looks like this. It runs on 3 different application servers, each with 4 threads executing this same chunk of code at the same time...

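BulkRequestBuilder bulkRequestBuilder = client.prepareBulk();
for (...) {
    IndexRequestBuilder indexRequest = // create new index request builder
    bulkRequestBuilder.add(indexRequest); // queue this document in the bulk
}
BulkResponse bulkResponse = bulkRequestBuilder.get(); // blocks until the bulk completes; item results are processed afterwards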

Is there anything in my configuration that looks wrong or might account for the hiccups? I would expect these calls to only take a few milliseconds each, and often they do, but then other times with the pauses they all get stuck together and might freeze for 5-10 seconds, at which point they all finish together.

Profiling with JVisualVM I see that Elastic is spending a lot of time stuck in BaseFuture$Sync.get() during the pauses.

Thanks for any help!

Webapp Environment: 3 tomcat nodes, each with 4 worker threads sending data (so 12 total)
Elastic Environment: 3 node cluster (all nodes master/data eligible), each with 8GB RAM and 4 CPUs
Elastic Version: 5.0.2

You should give BulkProcessor a try.

The problem here is that when you execute the bulk, it runs on the calling thread, so I believe the next iteration has to wait until it finishes.

I have seen no issue with BulkProcessor. It executes the bulk in parallel.
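
To illustrate the idea behind that suggestion (the flush happens off the thread that builds the batch), here is a rough sketch using only plain JDK classes. It is not the BulkProcessor API. The class name AsyncBatcher and the sender callback are hypothetical stand-ins for whatever actually executes the bulk request, and a real implementation would also need to bound the number of in-flight batches and handle failures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical helper: collects items and hands full batches to a background thread.
final class AsyncBatcher<T> implements AutoCloseable {
    private final int batchSize;
    private final Consumer<List<T>> sender; // stand-in for the code that sends one bulk
    private final ExecutorService flusher = Executors.newSingleThreadExecutor();
    private final List<CompletableFuture<Void>> inFlight = new ArrayList<>();
    private List<T> current = new ArrayList<>();

    AsyncBatcher(int batchSize, Consumer<List<T>> sender) {
        this.batchSize = batchSize;
        this.sender = sender;
    }

    // Adds one item; returns immediately even when a batch is handed off.
    synchronized void add(T item) {
        current.add(item);
        if (current.size() >= batchSize) {
            flush();
        }
    }

    // Hands the current batch to the background thread and starts a new one.
    synchronized void flush() {
        if (current.isEmpty()) {
            return;
        }
        List<T> batch = current;
        current = new ArrayList<>();
        inFlight.add(CompletableFuture.runAsync(() -> sender.accept(batch), flusher));
    }

    // Flushes the remainder and waits for every submitted batch to finish.
    @Override
    public synchronized void close() {
        flush();
        for (CompletableFuture<Void> f : inFlight) {
            f.join();
        }
        flusher.shutdown();
    }
}
```

The producing thread only blocks in close(), so it can keep building the next batch while the previous one is being sent.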

The call to .get() blocks though, right? After it completes I process the results of each one of the index requests added to the BulkRequestBuilder. So the next time the thread comes back around to this loop the previous request should be long cleared out and the thread in a good state to run again, correct?

I'm wondering if simply staggering the calls a bit (with a small thread sleep delay), so they don't all hit Elastic at the same time, would help.
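
A minimal sketch of what that staggering could look like, assuming a small random delay (jitter) before each send. The class BulkStagger, the sendBulk callback, and the 50 ms cap are made up for illustration; whether this actually reduces the pauses is untested.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper: delays each send by a small random amount.
final class BulkStagger {
    // Returns a random delay between 0 and maxJitterMillis, inclusive.
    static long jitterMillis(long maxJitterMillis) {
        if (maxJitterMillis <= 0) {
            return 0L;
        }
        return ThreadLocalRandom.current().nextLong(maxJitterMillis + 1);
    }

    // Sleeps for a random jitter, then runs the send action on the calling thread.
    static void sendWithJitter(Runnable sendBulk, long maxJitterMillis) throws InterruptedException {
        Thread.sleep(jitterMillis(maxJitterMillis));
        sendBulk.run();
    }

    public static void main(String[] args) throws InterruptedException {
        // The lambda stands in for executing the bulk request.
        sendWithJitter(() -> System.out.println("bulk sent"), 50);
    }
}
```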

I should have noted that this occurs during a heavy indexing period, during which I set replicas to 0 and the index refresh interval to -1. Are there any other settings I should consider changing to speed up indexing or help with the contention I'm seeing? Thanks again!

I'd give BulkProcessor a try first, before trying to fix issues that you might not otherwise have to fix. Just a thought.

BulkProcessor looks promising for future use perhaps, but it doesn't quite fit our current processing workflow.

I did, however, have some luck setting index.translog.durability to async during our heavy indexing period, which seemed to remove most (if not all) of the lock contention I was seeing.
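
For reference, the three index settings mentioned in this thread can be collected as below. This is only a sketch of the values: the class and method names are hypothetical, and the maps still have to be applied through whichever index settings API you use. The "after" values are the Elasticsearch defaults for the refresh interval and translog durability; the replica count is whatever your index normally uses. Note that with async durability the translog is only fsynced periodically, so a node crash can lose the most recent writes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for the index settings discussed in this thread.
final class BulkLoadSettings {
    // Settings used while the heavy indexing run is in progress.
    static Map<String, String> duringBulkLoad() {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("index.number_of_replicas", "0");
        settings.put("index.refresh_interval", "-1");
        settings.put("index.translog.durability", "async");
        return settings;
    }

    // Settings to restore afterwards; replicas is the index's normal replica count.
    static Map<String, String> afterBulkLoad(int replicas) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("index.number_of_replicas", String.valueOf(replicas));
        settings.put("index.refresh_interval", "1s");        // default refresh interval
        settings.put("index.translog.durability", "request"); // default durability
        return settings;
    }

    public static void main(String[] args) {
        System.out.println(duringBulkLoad());
        System.out.println(afterBulkLoad(1));
    }
}
```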
