Strict routing: only documents with one routing key per shard

trobby · October 18, 2017, 9:45am

We're currently optimizing the sharding setup of our Elasticsearch index to (surprise) decrease response times. At the moment the number of routing keys is equal to the number of shards. We're looking for a setup where all documents in a shard share a single routing key. Right now the distribution over the shards is very uneven, and some shards are even empty.

This is how it looks at the moment and how it should look:

Current

shard:0 -> routes: bmx, cyclocrosser
shard:1 -> routes: track-bike
shard:2 -> routes:
shard:3 -> routes: downhill

Wanted

shard:0 -> routes: bmx
shard:1 -> routes: track-bike
shard:2 -> routes: cyclocrosser
shard:3 -> routes: downhill

Is there any way to make sure that each routing key is routed to its own shard?

We know that the routing is based on djb2 (http://www.cse.yorku.ca/~oz/hash.html#djb2). Is there any option to influence this behavior, and can someone offer deeper insight into how the routing works internally?

Igor_Motov · October 22, 2017, 12:15pm

Why not just create 4 separate indices and use index name instead of routing?

trobby · November 10, 2017, 12:54pm

Thanks for the reply. You're right, it's the only way to achieve this in a proper and safe way.

To summarize the outcome: It's not possible.

Why? To work for most use cases, the routing is not based directly on the routing keys, because the documents could end up distributed very unevenly if the routing keys themselves are distributed that way (not in my case, but in general they might be). Hashing the routing key addresses this, and even if all documents with a certain routing key disappear, this will not necessarily leave a shard empty.

You can create a workaround based on knowledge of the hashing function in use (Murmur), but this might break if the Elasticsearch team decides to change the hashing function. This has already happened once, so it's not safe to rely on such a hidden feature.
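
To illustrate what such a workaround would have to do, here is a minimal sketch in Python. It assumes, without having been checked against a running cluster, that the routing string is hashed with 32-bit MurmurHash3 (seed 0) over its UTF-16 little-endian bytes and that the shard is the floor modulo of the signed hash by the number of primary shards. The helper names `guess_shard` and `collisions` are made up for this example, and newer versions may add further steps to the calculation, so treat the result as a guess only.

```python
def murmur3_x86_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 (x86, 32-bit variant); returns an unsigned 32-bit integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    mask = 0xFFFFFFFF

    def rotl(x: int, r: int) -> int:
        return ((x << r) | (x >> (32 - r))) & mask

    def mix_k(k: int) -> int:
        k = (k * c1) & mask
        k = rotl(k, 15)
        return (k * c2) & mask

    h = seed & mask
    n_blocks = len(data) // 4
    # Body: process the input in 4-byte little-endian blocks.
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        h ^= mix_k(k)
        h = rotl(h, 13)
        h = (h * 5 + 0xE6546B64) & mask

    # Tail: the remaining 1 to 3 bytes, if any.
    tail = data[4 * n_blocks:]
    if tail:
        h ^= mix_k(int.from_bytes(tail, "little"))

    # Finalization: mix in the length and avalanche the bits.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


def guess_shard(routing: str, num_shards: int) -> int:
    """Guess the shard for a routing key (hypothetical helper, see assumptions above)."""
    # Assumption: two bytes per UTF-16 code unit, low byte first, seed 0.
    unsigned = murmur3_x86_32(routing.encode("utf-16-le"), seed=0)
    # Reinterpret as a signed 32-bit value, as a Java int would be.
    signed = unsigned - (1 << 32) if unsigned >= (1 << 31) else unsigned
    # Python's % already behaves like a floor modulo for a positive divisor.
    return signed % num_shards


def collisions(keys, num_shards: int) -> dict:
    """Return only those shards that would receive more than one routing key."""
    by_shard = {}
    for key in keys:
        by_shard.setdefault(guess_shard(key, num_shards), []).append(key)
    return {shard: ks for shard, ks in by_shard.items() if len(ks) > 1}


if __name__ == "__main__":
    keys = ["bmx", "track-bike", "cyclocrosser", "downhill"]
    for key in keys:
        print(key, "->", guess_shard(key, 4))
    print("shared shards:", collisions(keys, 4))
```

With something like this you could try out alternative routing key spellings until `collisions` comes back empty, but the result is only valid for as long as the assumptions about the hash hold.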

The only way to achieve this is by creating a single index for each routing key as pointed out by Igor_Motov.
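
A rough sketch of what the index-per-key approach could look like on the client side, in Python. The `bikes` prefix and the helper names are invented for this example, and the functions only build request paths and bodies rather than talking to a cluster. The single-shard setting is my assumption of what matches the "one shard per key" goal.

```python
import json
import re

INDEX_PREFIX = "bikes"  # hypothetical prefix, not from the original setup


def index_for(routing_key: str) -> str:
    """Map a former routing key to the name of its dedicated index."""
    # Index names must be lowercase, so normalize to a conservative character set.
    slug = re.sub(r"[^a-z0-9_-]+", "-", routing_key.lower()).strip("-")
    if not slug:
        raise ValueError(f"cannot derive an index name from {routing_key!r}")
    return f"{INDEX_PREFIX}-{slug}"


def create_index_request(routing_key: str):
    """Return (method, path, body) for creating a one-shard index for the key."""
    body = {"settings": {"number_of_shards": 1}}
    return "PUT", f"/{index_for(routing_key)}", json.dumps(body)


def search_path(routing_keys) -> str:
    """Build a search path that targets only the indices for the given keys."""
    names = ",".join(index_for(key) for key in routing_keys)
    return f"/{names}/_search"


if __name__ == "__main__":
    for key in ["bmx", "track-bike", "cyclocrosser", "downhill"]:
        print(*create_index_request(key))
    print(search_path(["bmx", "downhill"]))
```

A query that previously passed a routing value would then name the matching index in the path instead, and a query across several keys lists several index names separated by commas.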

See also: https://stackoverflow.com/questions/46808084/elasticsearch-routing-only-documents-with-one-routing-key-per-shard/47223244#47223244
