Strict routing: only documents with one routing key per shard

We're currently optimizing the sharding setup of our Elasticsearch index to (surprise) decrease response times. At the moment the number of routing keys equals the number of shards. We're looking for a setup where all documents in a shard share a single routing key. Currently the distribution over the shards is very uneven; some shards are even empty.

This is how it looks at the moment and how it should look:

Current

  • shard:0 -> routes: bmx, cyclocrosser
  • shard:1 -> routes: track-bike
  • shard:2 -> routes: (empty)
  • shard:3 -> routes: downhill

Wanted

  • shard:0 -> routes: bmx
  • shard:1 -> routes: track-bike
  • shard:2 -> routes: cyclocrosser
  • shard:3 -> routes: downhill

Is there any way to make sure that each routing key is routed to its own shard?

We know that the routing is based on djb2 (http://www.cse.yorku.ca/~oz/hash.html#djb2). Is there any option to influence this behavior, and can someone offer deeper insight into how the routing works internally?

Why not just create 4 separate indices and use index name instead of routing?


Thanks for the reply. You're right, it's the only way to achieve this in a proper and safe way.

To summarize the outcome: It's not possible.

Why? To work for most use cases, a document's shard is not chosen directly from its routing key, because the documents would end up distributed very unevenly whenever the routing keys themselves are unevenly distributed (not in my case, but in general they might be). Instead the routing key is hashed, and the hash decides the shard. This spreads keys across the shards, and even if all documents with a certain routing key disappear, that need not leave a shard empty. The flip side is that two keys can hash to the same shard, as bmx and cyclocrosser do in the current layout above.
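To illustrate why four keys rarely spread out one per shard, here is a minimal sketch of hash-then-modulo placement. It is only an illustration under assumptions: djb2 is a stand-in taken from the link in the question (the hash Elasticsearch actually uses differs, see the Murmur remark further down), and shard_for and prob_all_distinct are hypothetical helper names, not Elasticsearch APIs.

```python
def djb2(key: str) -> int:
    """Stand-in string hash (djb2, truncated to 32 bits).
    Assumption: used for illustration only, not Elasticsearch's real hash."""
    h = 5381
    for byte in key.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def shard_for(routing: str, num_shards: int) -> int:
    """Hash the routing key, then take it modulo the shard count."""
    return djb2(routing) % num_shards


def prob_all_distinct(num_keys: int, num_shards: int) -> float:
    """Chance that num_keys uniformly hashed keys all hit different shards."""
    p = 1.0
    for i in range(num_keys):
        p *= max(num_shards - i, 0) / num_shards
    return p


keys = ["bmx", "track-bike", "cyclocrosser", "downhill"]
placement = {key: shard_for(key, 4) for key in keys}
print(placement)                 # which shard each key lands on with the stand-in hash
print(prob_all_distinct(4, 4))   # 0.09375
```

With a uniform hash, four keys on four shards end up perfectly separated in fewer than one case out of ten (4!/4^4 = 0.09375), so collisions and empty shards are the expected outcome rather than a misconfiguration.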

You can create a workaround based on knowledge of the hashing function used (Murmur), but this might break if the Elasticsearch team decides to change the hashing function. This has already happened, so it's not safe to rely on such a hidden feature.
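A rough sketch of what such a workaround could look like: search for a routing value per key that happens to land on the wanted shard. Everything here is an assumption for illustration. The djb2 function is a stand-in for the real hash (which would have to be reproduced exactly, Murmur according to this thread), and find_routing_value and the "#n" suffix scheme are hypothetical.

```python
def djb2(key: str) -> int:
    """Stand-in hash; a real workaround would need Elasticsearch's exact hash."""
    h = 5381
    for byte in key.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def find_routing_value(key, target_shard, num_shards, hash_fn=djb2, max_tries=10_000):
    """Try key, key#1, key#2, ... until one hashes onto target_shard."""
    for i in range(max_tries):
        candidate = key if i == 0 else f"{key}#{i}"
        if hash_fn(candidate) % num_shards == target_shard:
            return candidate
    raise ValueError(f"no routing value found for {key!r} on shard {target_shard}")


# Wanted layout from the question: one key per shard.
wanted = {"bmx": 0, "track-bike": 1, "cyclocrosser": 2, "downhill": 3}
routing_values = {
    key: find_routing_value(key, shard, num_shards=4)
    for key, shard in wanted.items()
}
print(routing_values)
```

The resulting values are only valid for one specific hash function and shard count. If either changes, every precomputed routing value may point at a different shard, which is exactly why this is fragile.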

The only way to achieve this is by creating a single index for each routing key as pointed out by Igor_Motov.
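As a sketch of the index-per-key approach, the routing key can simply be turned into an index name when building a bulk request body. The "bikes-" prefix, the "type" field, and the helper names index_for and bulk_body are assumptions made up for this example; only the newline-delimited action/document layout follows the bulk API format.

```python
import json

INDEX_PREFIX = "bikes-"  # hypothetical naming scheme


def index_for(bike_type: str) -> str:
    """Map what used to be the routing key to a dedicated index name."""
    return INDEX_PREFIX + bike_type.lower()


def bulk_body(docs) -> str:
    """Build a bulk request body: one action line, then one document line."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_for(doc["type"])}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


docs = [
    {"type": "bmx", "name": "example bike A"},
    {"type": "track-bike", "name": "example bike B"},
]
print(bulk_body(docs))
```

A search for a single bike type then targets just that index, while a search across all types can use an index pattern such as bikes-*.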

See also: https://stackoverflow.com/questions/46808084/elasticsearch-routing-only-documents-with-one-routing-key-per-shard/47223244#47223244
