Hello Elasticsearch experts,
It would be great if anyone could confirm whether asynchronous replication has been deprecated, as this issue suggests:
https://github.com/elastic/elasticsearch/issues/10114?
Thanks in advance,
Lin
Yes, it has been.
Thanks Mark,
Then what is the newly suggested method?
BTW, I want to improve bulk insert performance, and I think async replication would be a good solution. Is that right?
regards,
Lin
Asynchronous replication is not related to performance, and you do not need it. It is good that it is deprecated, and it was never active by default. Many people confuse it with faster replication or with write consistency, but all it does is return the response early, without waiting for the nodes that hold replica shards, so their answers are ignored when the API call is evaluated.
It is simpler to disable replicas by setting the number of replicas (index.number_of_replicas) to 0 at the beginning of bulk indexing and raising it again afterwards with an index settings update, and to set index.refresh_interval to -1 while bulk indexing. This gives much better performance than replicating at indexing time.
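As a rough sketch of the settings change described above, the following Java builds the two index settings update requests using only the JDK's java.net.http classes (Java 11+). The base URL http://localhost:9200, the index name my-index, and the restored values (1 replica, 1s refresh) are assumptions for illustration; substitute whatever your index used before the bulk load. The requests are only built and printed here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative helper; the class and method names are hypothetical.
final class BulkSettingsSketch {

    // JSON body for an index settings update.
    // A refresh interval of "-1" disables periodic refresh.
    static String settingsBody(int replicas, String refreshInterval) {
        return "{\"index\":{\"number_of_replicas\":" + replicas
                + ",\"refresh_interval\":\"" + refreshInterval + "\"}}";
    }

    // PUT <baseUrl>/<index>/_settings with the given JSON body.
    static HttpRequest settingsRequest(String baseUrl, String index, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        String base = "http://localhost:9200"; // assumed local node
        String index = "my-index";             // hypothetical index name

        // Before bulk indexing: no replicas, no periodic refresh.
        String beforeBody = settingsBody(0, "-1");
        HttpRequest before = settingsRequest(base, index, beforeBody);

        // After bulk indexing: restore the previous values (assumed here).
        String afterBody = settingsBody(1, "1s");
        HttpRequest after = settingsRequest(base, index, afterBody);

        System.out.println(before.method() + " " + before.uri() + " " + beforeBody);
        System.out.println(after.method() + " " + after.uri() + " " + afterBody);
        // To apply a request, send it with java.net.http.HttpClient, e.g.
        // HttpClient.newHttpClient().send(before, HttpResponse.BodyHandlers.ofString());
    }
}
```

Restoring the replica count after the load makes the cluster copy the finished shards to the replicas once, instead of indexing every document on each replica during the load.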
For best performance, i.e. dynamically matching the power of the servers in your cluster, use the Java API BulkProcessor and tune the number of actions per request and the concurrency level. Do not forget to evaluate the bulk request responses, and continue only if the bulk requests succeeded.
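To illustrate the two tuning knobs and the response check, here is a minimal stand-in written against the JDK only. It is not the Elasticsearch BulkProcessor and does not use its API: the class BulkBatcherSketch, its methods, and the sender callback are hypothetical names, and actions are plain strings. The sender is assumed to send one bulk request and return true only if the response reported success.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Predicate;

// Hypothetical stand-in for illustration; NOT the Elasticsearch BulkProcessor.
final class BulkBatcherSketch {
    private final int actionsPerRequest;
    private final Predicate<List<String>> sender; // true = bulk request succeeded
    private final Semaphore inFlight;             // limits concurrent requests
    private final ExecutorService pool;
    private final AtomicBoolean failed = new AtomicBoolean(false);
    private List<String> buffer = new ArrayList<>();

    BulkBatcherSketch(int actionsPerRequest, int concurrency, Predicate<List<String>> sender) {
        this.actionsPerRequest = actionsPerRequest;
        this.sender = sender;
        this.inFlight = new Semaphore(concurrency);
        this.pool = Executors.newFixedThreadPool(concurrency);
    }

    // Buffers one action; returns false once a failed request has been observed.
    synchronized boolean add(String action) {
        if (failed.get()) return false;
        buffer.add(action);
        if (buffer.size() >= actionsPerRequest) flush();
        return !failed.get();
    }

    private void flush() {
        if (buffer.isEmpty()) return;
        List<String> batch = buffer;
        buffer = new ArrayList<>();
        inFlight.acquireUninterruptibly(); // waits while all request slots are busy
        if (failed.get()) {                // an earlier request failed: do not continue
            inFlight.release();
            return;
        }
        pool.execute(() -> {
            try {
                if (!sender.test(batch)) failed.set(true);
            } catch (RuntimeException e) {
                failed.set(true);
            } finally {
                inFlight.release();
            }
        });
    }

    // Sends any remainder, waits for in-flight requests, reports overall success.
    synchronized boolean finish() {
        flush();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) failed.set(true);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            failed.set(true);
        }
        return !failed.get();
    }

    public static void main(String[] args) {
        BulkBatcherSketch batcher = new BulkBatcherSketch(1000, 2, batch -> {
            System.out.println("sending " + batch.size() + " actions");
            return true; // a real sender would inspect the bulk response here
        });
        for (int i = 0; i < 2500 && batcher.add("doc-" + i); i++) { }
        System.out.println("all bulk requests succeeded: " + batcher.finish());
    }
}
```

The two constructor arguments correspond to the knobs mentioned above: the number of actions per request and the concurrency level. Suitable values depend on your documents and hardware, so they need to be measured rather than guessed.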
Thanks Jörg,
regards,
Lin
1 - Yep!
2 - Also yes. You need to test to see how many threads and what bulk request size work best for you.
Thank you Mark, have a good weekend.