Possible To Have Multiple Simultaneous Bulk Inserts?


(Kevin M.) #1

I want to run multiple clients that will be doing bulk inserts into the same Elasticsearch index. Is that OK, or do I run the risk of some sort of clash/lock issue(s)? Any info would be appreciated. Thanks.

Kevin


(Mark Walkom) #2

It won't lock, but you may get version conflicts.


(Hanish Bansal) #3

@warkolm It's always good to have multiple clients doing bulk inserts when the ingestion rate is high. But as you mentioned regarding the version conflict issue, there is a chance of data loss.

If multiple clients are writing data in parallel and one client gets a version conflict error, that client's changes will be lost. Any thoughts on how to resolve this issue?


(Mark Walkom) #4

There is no loss; the request is notified of the conflict, and it's up to you to do something with it.
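To illustrate, a bulk response reports per-item status, so conflicted items can be detected and retried by the client. The sketch below parses a hypothetical response shaped like the `_bulk` API's output (the document ids and index name are made up for the example):

```python
import json

# Hypothetical _bulk response: the first item succeeded, the second hit a
# version conflict (HTTP 409). The shape mirrors Elasticsearch's bulk
# response format: {"errors": ..., "items": [{"index": {...}}, ...]}.
bulk_response = json.loads("""
{
  "errors": true,
  "items": [
    {"index": {"_id": "1", "status": 201, "result": "created"}},
    {"index": {"_id": "2", "status": 409,
               "error": {"type": "version_conflict_engine_exception"}}}
  ]
}
""")

def conflicted_ids(response):
    """Collect ids of items that hit a version conflict so the caller
    can re-queue just those documents instead of losing them."""
    ids = []
    for item in response["items"]:
        # Each item has a single key ("index", "update", etc.) wrapping
        # the per-action result.
        action = next(iter(item.values()))
        if action.get("status") == 409:
            ids.append(action["_id"])
    return ids

print(conflicted_ids(bulk_response))  # ['2']
```

Nothing is silently dropped: the conflicted ids come back in the response, and the client decides whether to retry, refetch, or skip them.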


(Christian Dahlqvist) #5

This is the normal thing to do and essential for getting optimal indexing throughput; Logstash, for example, uses multiple connections in parallel by default. If you allow Elasticsearch to assign document ids, there should be no clashes or locking issues. If, however, you have concurrent updates, you might see some contention and conflicting updates, which is usually resolved automatically through retries.
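A minimal sketch of both points, building `_bulk` request bodies with the standard library only (the index name `my-index` is a placeholder): omitting `_id` from the action line lets Elasticsearch auto-assign ids, and for updates the `retry_on_conflict` parameter tells Elasticsearch to retry internally instead of failing with a 409.

```python
import json

def bulk_index_body(docs, index="my-index"):
    """Build an NDJSON _bulk body for plain indexing. No "_id" is set in
    the action line, so Elasticsearch auto-assigns ids and concurrent
    clients cannot collide on the same document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # _bulk bodies must end with a newline

def bulk_update_action(doc_id, partial, index="my-index", retries=3):
    """Build one update action. retry_on_conflict asks Elasticsearch to
    retry the update on a version conflict before reporting an error."""
    action = json.dumps({"update": {"_index": index, "_id": doc_id,
                                    "retry_on_conflict": retries}})
    return action + "\n" + json.dumps({"doc": partial}) + "\n"

body = bulk_index_body([{"msg": "a"}, {"msg": "b"}])
```

Each client would POST such a body to `/_bulk`; since no ids are shared between clients on ingest, the version-conflict case only arises for the update path.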


(Kevin M.) #6

Thanks for all of the responses. The processing completed with no issues.

Kevin


(system) #7

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.