Requests Containing an Array of Custom Routing Values

Chinmaya_Poswalia · December 5, 2018, 12:36am

Hi Guys,
We have a cluster partitioned by a certain key, for which we use custom routing. We now have a use case where we need to request data across many different keys, so the key list will be large. I wanted to understand how a custom routing query is processed. I know that when we specify a list of routing keys, Elasticsearch goes to all the shards that those routing keys map to and fetches the data. I have a couple of questions about that:

Let's say the custom routing parameter array contains 1000 values, but there are only 10 shards. Will there be 1000 requests internally, one for each routing value, or at most 10 requests, one per shard?
When returning data for a query containing multiple routing values, does Elasticsearch paginate across the requests (sorting results across all partitions) and return the requested page, or does it return data partitioned by routing key?

warkolm · December 5, 2018, 12:41am

We aren't all guys

Requests are sent per shard. So you essentially end up with one bulk request with 100 individual requests per shard.

Elasticsearch will just look at it as one response, irrespective of the routing. I don't know if there is a way to change that though.
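
To illustrate the "requests are sent per shard" point, here is a minimal sketch of how many distinct shards a list of routing values can resolve to. It assumes the simplified routing formula hash(_routing) % number_of_primary_shards, and it uses CRC32 purely as a stand-in hash (Elasticsearch uses its own hash function), so the actual shard numbers will differ. The names shard_for and target_shards are made up for this example.

```python
import zlib


def shard_for(routing: str, num_shards: int) -> int:
    # Stand-in hash (assumption): CRC32 keeps the sketch stdlib-only.
    # Elasticsearch hashes the routing value differently, so the exact
    # shard numbers here are illustrative only.
    return zlib.crc32(routing.encode("utf-8")) % num_shards


def target_shards(routing_values, num_shards):
    # Many routing values collapse onto a small set of shard numbers.
    return {shard_for(r, num_shards) for r in routing_values}


routing_values = [f"A{i}" for i in range(1, 1001)]  # 1000 routing values
shards = target_shards(routing_values, 10)          # only 10 shards exist

# However long the routing list is, it cannot resolve to more shards
# than the index has.
print(f"{len(routing_values)} routing values -> {len(shards)} shard(s)")
```
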

Chinmaya_Poswalia · December 5, 2018, 12:57am

Thanks @warkolm for your response.

My sincere apologies

So in the above example, assuming the expected response is the top 25 records:

The coordinating node will query all 10 shards, with each one returning 25 records.
The coordinating node will then sort the 250 records and return the top 25 of them.
Is the above flow correct?

warkolm · December 5, 2018, 1:13am

Yes, it's the same as any other query.
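
The flow confirmed above can be sketched roughly as follows. This is a toy model, not Elasticsearch code: documents are plain dicts with a made-up "score" field, and shard_top_k and coordinate are hypothetical helper names. It only shows why merging the per-shard top 25 is enough to get the global top 25.

```python
import heapq
import random


def shard_top_k(docs, k):
    """Each shard returns only its own k best hits."""
    return heapq.nlargest(k, docs, key=lambda d: d["score"])


def coordinate(shards, k):
    """Gather k hits from every shard, then keep the global top k."""
    candidates = [hit for docs in shards for hit in shard_top_k(docs, k)]
    return heapq.nlargest(k, candidates, key=lambda d: d["score"])


rng = random.Random(0)
# 10 shards with 200 toy documents each (hypothetical data).
shards = [
    [{"id": f"s{s}-d{d}", "score": rng.random()} for d in range(200)]
    for s in range(10)
]

# Up to 10 * 25 = 250 candidates are merged down to the final 25.
top = coordinate(shards, 25)
print(len(top))
```

For later pages the same idea applies, except that each shard has to return from + size hits rather than just size, which is why deep pagination gets expensive.
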

Chinmaya_Poswalia · December 5, 2018, 2:49am

Is there a way to specify in the query that each target shard should query and filter for only the subset of values that resolved to it based on the custom routing provided?

For example, let's say the cluster is partitioned by attribute A, which is also present in the document.

We have a query to get the top 25 documents where the document's attribute A value should be among (A1, A2, ..., A50).
We have 10 shards such that shard S1 contains A1 to A5, shard S2 contains A6 to A10, and so on.

Is there a way to query such that

Shard S1 queries for A1 to A5 in its index and not for A6 to A50. Similarly, S2 queries only for A6 to A10, and so on.
This way shard S1 doesn't need to process values that it doesn't contain and serves the query faster.

warkolm · December 5, 2018, 3:08am

If you want to do that, then you should use filters; that way it will use bitsets, which are super efficient.
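
As a rough sketch of what that could look like: a terms query can be placed in the filter clause of a bool query, and the same values can be passed as a comma-separated routing parameter on the search request. The field name "A", the value names, and the build_search helper are assumptions taken from the example in this thread, not from a real mapping. Note that every targeted shard still receives the full list of 50 values; the filter just makes checking them cheap.

```python
import json


def build_search(values, size=25):
    # Query-string parameters: routing takes a comma-separated list.
    params = {"routing": ",".join(values)}
    # Request body: the terms query runs in filter context (no scoring).
    body = {
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    {"terms": {"A": values}}  # "A" is a hypothetical field name
                ]
            }
        },
    }
    return params, body


values = [f"A{i}" for i in range(1, 51)]  # A1 .. A50 from the example
params, body = build_search(values)

print(params["routing"][:20] + "...")
print(json.dumps(body["query"], indent=2)[:200])
```
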

Chinmaya_Poswalia · December 5, 2018, 3:54am

Did you mean the Bool Query (Elasticsearch Guide [6.5]) or the Terms query (Elasticsearch Guide [8.11])?

I couldn't find any documentation stating that terms is supported under filter.

system · January 2, 2019, 3:54am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
