Say I have a cluster of 6 nodes and a single index with 6 shards. I index
documents into it using a routing value based on the hour of the day (so
00 to 23). This means I have 24 different routings spread over 6 shards,
with each shard on a separate node.
I then want to query this index over a time range, both by applying the
range as a filter and by passing the matching hours as routings. For
example, I can do the following:
curl 'localhost:9200/_search?routing=09,10,11'
This should limit the set of documents I have to filter and search through.
This request will be distributed to 1-3 shards.
That much is straightforward, and the same goes for six routings:
curl 'localhost:9200/_search?routing=09,10,11,12,13,14'
This request will be distributed to 1-6 shards.
However, let's say I use twice as many routings as shards:
curl 'localhost:9200/_search?routing=09,10,11,12,13,14,15,16,17,18,19,20'
This request will also be distributed to 1-6 shards, and ES will report
hitting that many.
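
To make the shard counts above concrete, here is a rough sketch of how I
understand routings map to shards: hash the routing value, take it modulo
the number of shards, and collect the distinct results. Note that
shard_for and target_shards are names I made up for illustration, and
zlib.crc32 is only a stand-in hash, not the function Elasticsearch
actually uses, so the shard numbers it prints will not match a real
cluster. Only the counting argument carries over.

```python
import zlib

NUM_SHARDS = 6  # matches the 6-shard index described above


def shard_for(routing: str, num_shards: int = NUM_SHARDS) -> int:
    # ASSUMPTION: crc32 is a stand-in hash for illustration only.
    # Elasticsearch uses its own hash function, so real shard numbers differ.
    return zlib.crc32(routing.encode("utf-8")) % num_shards


def target_shards(routings):
    # A set drops duplicates, so two routings that land on the same
    # shard are counted once.
    return {shard_for(r) for r in routings}


# The three requests above: 3, 6 and 12 hourly routings.
for hours in (range(9, 12), range(9, 15), range(9, 21)):
    routings = [f"{h:02d}" for h in hours]
    shards = target_shards(routings)
    print(f"{len(routings):2d} routings -> {len(shards)} distinct shard(s): {sorted(shards)}")
```

Whatever the hash, 12 routings over 6 shards can never give more than 6
distinct shards, so at least some routings must share a shard.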
However, when testing I've noticed that this seems to be slower than the
6-routing request, which goes to just as many shards but with fewer
routings. Even when I make sure that each routing in the 6-routing request
goes to a separate shard, the 12-routing one seems slower.
I'm wondering whether using more routings than shards causes multiple
queries to be sent to the same shard (although the routing code used when
querying suggests otherwise). Or am I missing something here?
Does Elasticsearch collapse routings that go to the same shard?