Hi Guys,
We have a cluster partitioned by a certain key for which we are using custom routing. We now are having a use-case where we have to request data across different key with the key list to be large. So wanted to understand how is the custom routing query processed? I know when we specify a list of routing keys, elastic search goes to all the shards mapping those sets of routing keys and fetches the data. I have a couple of questions around that
Lets say the custom routing parameter array contains 1000 values. There are only 10 shards though. Will there be 1000 request internally with one for each routing or at max there will be 10 request with one request per shard.
When returning data with query containing multiple routing values, does elasticsearch paginate the requests (sort request across all partitions) and return the requested page or it returns data partitioned by routing key.
Is there a way to specify in the query to tell final shard to query and filter for only the list of values that it was resolved to based on the custom routing provided.
For example, lets say the cluster is partitioned by attribute A which is also present in the document.
We have a query to get the top 25 documents where document's attribute A value should be among (A1, A2, .... A50).
We have 10 shards such that shard S1 contains A1.. A5 and shard S2 contains A6 to A10 and so on.
Is there a way to query such that
Shard S1 queries for A1 to A5 in its index and not from A6 to A50. Similarly S2 queries only for A6 to A10 and so on.
This way shard S1 doesn't need to processes values that it doesn't contain and serves the query faster.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.