_bulk request with routing parameter

Hunter_Tr_n · November 9, 2021, 8:35am

I am currently using _bulk requests for reindexing. Each _bulk request contains a large number of operations (create, update, or delete), and each document already has its own routing value.

I wonder whether setting the routing parameter on the _bulk request itself would give better performance.

To do that, I would also need to make sure that all documents in the same _bulk request are stored on the same shard.

If the answer is yes, which hashing function must I use to group messages into _bulk requests?
Can anyone help verify this?

Christian_Dahlqvist · November 9, 2021, 8:52am

If you are indexing immutable data, this could improve performance, as the batch size going to each shard would likely increase. To make sure you spread the load, you could use a timestamp or random number as the routing key. I am not sure how much impact this would have, though.
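One way to get larger per-shard batches without reproducing Elasticsearch's hash on the client is to group the bulk actions by their routing value, since documents with the same routing value always land on the same shard. The following is only a minimal sketch under stated assumptions: the index name `items`, the routing values, and the helper names `group_by_routing` and `to_ndjson` are all hypothetical, and it only builds the NDJSON bodies rather than sending them. Note that two different routing values can still map to the same shard, so this grouping is finer than strictly per-shard.

```python
import json
from collections import defaultdict


def group_by_routing(actions):
    """Group (metadata, source) pairs by the routing value in the action metadata."""
    groups = defaultdict(list)
    for meta, source in actions:
        op = next(iter(meta))               # "index", "create", "update" or "delete"
        routing = meta[op].get("routing")   # None if the action carries no routing
        groups[routing].append((meta, source))
    return groups


def to_ndjson(pairs):
    """Serialize (metadata, source) pairs into a newline-delimited _bulk body."""
    lines = []
    for meta, source in pairs:
        lines.append(json.dumps(meta))
        if source is not None:              # delete actions have no source line
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"          # a _bulk body must end with a newline


# Hypothetical sample data: index name and routing values are made up.
actions = [
    ({"index": {"_index": "items", "_id": "1", "routing": "user-a"}}, {"name": "one"}),
    ({"delete": {"_index": "items", "_id": "2", "routing": "user-b"}}, None),
    ({"update": {"_index": "items", "_id": "3", "routing": "user-a"}}, {"doc": {"name": "three"}}),
]

# One NDJSON body per routing value; each body could be sent as its own _bulk request.
bodies = {routing: to_ndjson(pairs) for routing, pairs in group_by_routing(actions).items()}
```

Each entry in `bodies` could then be posted as a separate _bulk request. Whether this actually helps depends on the cluster and the data, so it is worth measuring before adopting it.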

Why is your diagram showing bulk requests going through a master node?

Hunter_Tr_n · November 9, 2021, 9:00am

My _bulk request contains multiple reindex operations, and each document already carries its routing value. I wonder whether applying the routing parameter to the request would reduce the cost of transport between ES nodes.
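On the hashing question raised earlier in the thread: as far as I understand the Elasticsearch documentation for the `_routing` field, the target shard is derived from a hash of the routing value (Murmur3, to my knowledge) roughly as follows. Treat this as a sketch of the documented formula rather than a client-side recipe. The symbol names follow the index settings, and the exact hash input encoding is not covered here.

```latex
\text{routing\_factor} = \frac{\text{num\_routing\_shards}}{\text{num\_primary\_shards}},
\qquad
\text{shard\_num} = \left\lfloor
  \frac{\operatorname{hash}(\text{\_routing}) \bmod \text{num\_routing\_shards}}
       {\text{routing\_factor}}
\right\rfloor
```

Because equal routing values always produce the same `shard_num`, grouping by routing value is enough to keep a batch on one shard without reimplementing the hash.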
