[Help] Understanding routing to an index partition

Radaeld · January 8, 2022, 8:03pm

Hey all
I have some questions regarding the routing_partition_size setting.
I'm not sure, but from reading the documentation, this is what I understood:

By default, a routing key maps to exactly 1 primary shard, both for indexing documents and for fetching them.
When routing_partition_size is specified, the routing key instead maps to the number of shards given in that setting.

I tried to test this. I created an index with 6 primary shards and set routing_partition_size to 4.

curl -X PUT localhost:9200/my-index-000000 -d '{ "settings": { "number_of_shards": 6, "number_of_replicas": 0, "index.routing_partition_size": 4}, "mappings": {"_routing": {"required": true}}}' -H "Content-Type: application/json"

So in theory, documents indexed with a given routing key should be distributed among 4 primary shards, or so I believed. But I indexed 100k documents and they all went to the same shard:

for i in {0..100000}; do curl -X POST "localhost:9200/my-index-000000/_doc?routing=key1" -d '{ "data": "something" }' -H "Content-Type: application/json"; done

curl -X GET "localhost:9200/_cat/shards?v"

my-index-000000                                               2     p      STARTED      0   226b 172.20.0.4 es02
my-index-000000                                               5     p      STARTED      0   226b 172.20.0.4 es02
my-index-000000                                               3     p      STARTED      0   226b 172.20.0.2 es03
my-index-000000                                               1     p      STARTED      0   226b 172.20.0.3 es01
my-index-000000                                               4     p      STARTED      0   226b 172.20.0.3 es01
my-index-000000                                               0     p      STARTED 100001  2.7mb 172.20.0.2 es03

Am I missing something, or did I misunderstand this?

Thanks in advance for the help
And happy new year

warkolm · January 9, 2022, 11:15pm

Welcome to our community!

I don't know this in detail, but my reading of the docs suggests that it'd work like this when indexing:

1. From your 6 shards, only use 4
2. Then index into one of those 4
3. From your 6 shards, only use 4, but as you're looking at the entire 6 shards again, it's possible that it'll pick ones that weren't previously used
4. Then index into one of those 4
5. And so on

Basically, just because you set the partition size to less than the total shard count doesn't mean the shards picked for indexing will always be the same.

But that's an educated guess based on my use of the product for a while, so I don't know the feature or the code well enough to state that as actual fact. Hopefully someone who does know can stop in and comment either way so we can both learn.
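
For reference, the `_routing` field documentation describes shard selection on a partitioned index roughly as `routing_value = hash(_routing) + hash(_id) % routing_partition_size`, followed by `shard_num = (routing_value % num_routing_shards) / routing_factor`. Below is a minimal shell sketch of that formula, not the actual implementation. The function names are made up for illustration, `cksum` is only a stand-in for the Murmur3 hash Elasticsearch uses (so shard numbers will not match a real cluster), and `num_routing_shards` and `routing_factor` are passed in as plain parameters rather than read from index settings.

```shell
#!/usr/bin/env bash
# Sketch of the documented routing formula for partitioned indices.
# ASSUMPTION: cksum (CRC) replaces Elasticsearch's Murmur3 hash, so the
# shard numbers printed here will not match what a real cluster picks.

# hash_of STRING -> unsigned integer hash of STRING
hash_of() {
  printf '%s' "$1" | cksum | cut -d' ' -f1
}

# shard_for ROUTING ID PARTITION_SIZE NUM_ROUTING_SHARDS ROUTING_FACTOR
shard_for() {
  local routing_hash id_hash routing_value
  routing_hash=$(hash_of "$1")
  id_hash=$(hash_of "$2")
  # routing_value = hash(_routing) + hash(_id) % routing_partition_size
  routing_value=$(( routing_hash + id_hash % $3 ))
  # shard_num = (routing_value % num_routing_shards) / routing_factor
  echo $(( (routing_value % $4) / $5 ))
}

# distinct_shards ROUTING PARTITION_SIZE NUM_ROUTING_SHARDS ROUTING_FACTOR
# Counts how many different shards one routing key reaches over 200
# hypothetical document ids (doc-1 .. doc-200).
distinct_shards() {
  local i
  for i in $(seq 1 200); do
    shard_for "$1" "doc-$i" "$2" "$3" "$4"
  done | sort -un | wc -l | tr -d ' '
}
```

For example, `distinct_shards key1 4 6 1` counts the shards reached by `key1` with a partition size of 4, assuming 6 routing shards and a routing factor of 1. Those last two values are illustrative assumptions, and a real index may use different ones. In this sketch the `_id` only selects an offset between 0 and `routing_partition_size - 1`, so a single routing key can never reach more shards than the partition size.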

Radaeld · January 11, 2022, 9:34pm

Hey @warkolm
Yup, it seems to be as I was thinking.

It seems there is a bug report about this on GitHub. That would explain why all 100k documents went to the same shard.

Thanks for the answer and help

