Routing: Equal distribution on all shards

Hi, I have 5 unique routing keys "a","b","c","d","e" in my application.
My ES setup consists of 5 shards.

When I check the current state (version 6.6.2), my routing keys map to shards like this:

Routing key	Shard
a	4
b	2
c	0
d	3
e	4

Note that no routing key maps to shard 1, while two keys (a and e) map to shard 4.

Query
Is there a way to ensure that my 5 routing keys are distributed across the 5 shards, with one key per shard?
I understand that a Murmur hash algorithm is involved in assigning a routing key to a shard.

The only way to force one routing key per shard is to specify the routing when indexing documents in Elasticsearch.

The default routing is based on a hash of the document's _id, rather than a specific key.
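For reference, the Elasticsearch routing documentation gives the rule as shard_num = hash(_routing) % num_primary_shards. Below is a minimal Python sketch of that rule. The helper names (shard_for, keys_per_shard) are hypothetical, and the default hash is CRC32 as a stand-in, not Elasticsearch's Murmur3, so the shard numbers it prints will not match a real cluster. It only shows why a few keys can share a shard while another shard stays empty.

```python
import zlib


def shard_for(routing, num_primary_shards, hash_fn=None):
    """Apply shard_num = hash(_routing) % num_primary_shards."""
    if hash_fn is None:
        # ASSUMPTION: CRC32 is a stand-in hash for illustration only.
        # Elasticsearch uses Murmur3, so real shard numbers will differ.
        hash_fn = lambda s: zlib.crc32(s.encode("utf-8"))
    # Python's % is never negative for a positive modulus.
    return hash_fn(routing) % num_primary_shards


def keys_per_shard(keys, num_primary_shards, hash_fn=None):
    """Group routing keys by the shard they map to, including empty shards."""
    groups = {shard: [] for shard in range(num_primary_shards)}
    for key in keys:
        groups[shard_for(key, num_primary_shards, hash_fn)].append(key)
    return groups


if __name__ == "__main__":
    for shard, keys in keys_per_shard(["a", "b", "c", "d", "e"], 5).items():
        print(shard, keys)
```

Nothing in the rule prevents two routing values from producing the same remainder, which is how a and e can both end up on shard 4.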

Thanks for your reply.

I am not using the default routing; while indexing I do specify a routing value, e.g. routing=a.

My problem is that once I finish indexing for all my routing keys, shard 1 remains empty.
I need a way to make sure that, with keys a, b, c, d, e and 5 shards in ES, the documents for each routing key go to a separate shard.

I understand that this depends entirely on the hash generated for the given keys.

My base problem:
In the above example, routing=a and routing=e both go to the same shard, i.e. shard 4.
Assume a and e are types of documents that have a few fields in common. When I query documents with routing=a on those common fields, ES also checks all documents with routing=e, which lowers performance considerably.
Check this thread

An issue I see with assigning shards by routing:
If I have 50 shards and 100 routing keys, it may happen that, due to the nature of the generated hashes, some shards never receive any documents.

Suggestion:
There could be a mapping from routing key to shard number in the index settings, which would give this control to the user, so that the user decides which shard each routing key maps to.

At the moment the routing key is hashed instead of the id. If you have a large number of routing keys, probability will give a reasonably even distribution. If you need more control and have only a few routing keys, consider using multiple separate indices instead.
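To put rough numbers on this, here is a minimal Python sketch. It assumes the routing hash behaves like a uniform, independent random choice of shard for each key, which is an idealisation and not a statement about the actual Murmur3 output. The function names are made up for illustration.

```python
def prob_one_key_per_shard(num_shards, num_keys):
    """Probability that every routing key lands on a different shard.

    ASSUMPTION: each key picks a shard uniformly and independently.
    """
    if num_keys > num_shards:
        return 0.0
    p = 1.0
    for already_taken in range(num_keys):
        # The next key must avoid the shards already taken.
        p *= (num_shards - already_taken) / num_shards
    return p


def expected_empty_shards(num_shards, num_keys):
    """Expected number of shards that receive no routing key at all.

    Under the same assumption, a given shard is missed by all keys
    with probability (1 - 1/num_shards) ** num_keys.
    """
    return num_shards * (1 - 1 / num_shards) ** num_keys


if __name__ == "__main__":
    print(prob_one_key_per_shard(5, 5))
    print(expected_empty_shards(50, 100))
```

Under that assumption, 5 keys on 5 shards end up one per shard only about 4% of the time (5!/5^5 = 0.0384), and with 100 keys on 50 shards roughly 6 to 7 shards are expected to stay empty. So with few keys relative to the number of shards, gaps and collisions are the normal outcome rather than bad luck.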

I already have 200 indices, and choosing the index based on certain factors is already handled manually in my code.
Each index contains its own unique types, which I call the routing keys, and which I wanted to distribute equally across shards (my original question).

This is like types (routing keys) inside types (my indices).

I wonder why control over the shard is not handed to the user when a routing value is given. I think this would add flexibility to ES and give users better control.
