I want to run multiple clients that will be doing bulk inserts into the same Elasticsearch index. Is that OK, or do I run the risk of some sort of clash/lock issue(s)? Any info would be appreciated. Thanks.
Kevin
It won't lock, but you may get versioning mismatches.
@warkolm It's always good to have multiple clients for bulk inserts when the ingestion rate is high. But regarding the version mismatch issue you mentioned, isn't there a chance of data loss?
If multiple clients are writing data in parallel and one of them gets a version mismatch error, that client's changes will be lost. Any thoughts on how to resolve this?
There is no data loss; the request is notified of the mismatch and it's up to you to do something with it.
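For example, here is a minimal sketch with the Python elasticsearch client of inspecting a bulk response for version conflicts and handling them yourself (the index name my-index, the localhost URL, and the document contents are just placeholders):

```python
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch("http://localhost:9200")

actions = [
    {"_op_type": "index", "_index": "my-index", "_source": {"value": i}}
    for i in range(1000)
]

# raise_on_error=False makes bulk() return the per-item errors instead of raising,
# so version conflicts (HTTP 409) can be handled explicitly, e.g. by retrying.
success, errors = bulk(es, actions, raise_on_error=False)
for err in errors:
    op = next(iter(err.values()))   # e.g. {"index": {"_id": ..., "status": 409, ...}}
    if op.get("status") == 409:
        print("version conflict on id", op.get("_id"), "- retry or drop as appropriate")
```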
This is the normal thing to do and essential for getting optimal indexing throughput. Logstash uses multiple connections in parallel by default. If you are allowing Elasticsearch to assign document ids, there should be no clashes or locking issues. If you have concurrent updates, however, you might see some contention and conflicting updates, which are usually resolved automatically through retries.
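As an illustration, here is a minimal sketch of several workers bulk-indexing into the same index in parallel with the Python elasticsearch client. The index name, URL, worker count, and document contents are placeholders; since no "_id" is supplied, Elasticsearch assigns the document ids, so the writers cannot clash:

```python
from concurrent.futures import ThreadPoolExecutor
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

INDEX = "my-index"

def index_batch(batch_no):
    # One client per worker; omitting "_id" lets Elasticsearch generate ids,
    # so parallel writers never collide on the same document.
    es = Elasticsearch("http://localhost:9200")
    actions = (
        {"_op_type": "index", "_index": INDEX, "_source": {"batch": batch_no, "n": i}}
        for i in range(10_000)
    )
    return bulk(es, actions, stats_only=True)  # returns (indexed, failed) counts

with ThreadPoolExecutor(max_workers=4) as pool:
    for indexed, failed in pool.map(index_batch, range(4)):
        print(indexed, "indexed,", failed, "failed")
```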
Thanks for all of the responses. The processing completed with no issues.
Kevin
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.