I've looked online for answers to this but can't find anything that tells me definitively. I'm looking for an explanation of what
"number_of_routing_shards": 30
means. I'm reducing the default template to 1 shard (we won't have huge amounts of data) but I'd like to know what the routing shards relate to before I commit the change.
It's related to index splitting; for details, see the Elasticsearch Split Index docs. To be able to split an existing index later, this setting has to be set at index creation time. Factorising number_of_routing_shards / number_of_shards = a * b * c * ... gives you the split factors (a, b, c, etc.) you can use over the lifetime of the index. For example, with 5 shards and 30 routing shards, 30 / 5 = 2 * 3, so you could split 5 → 10 → 30, 5 → 15 → 30, or 5 → 30 in one step.
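To make the factorisation concrete, here is a rough Python sketch that lists the shard counts an index could be split to. The function name `split_targets` is made up for illustration and is not part of any Elasticsearch client. It assumes the rules as I understand them from the split docs: the target shard count must be a multiple of the current shard count and must divide number_of_routing_shards evenly.

```python
def split_targets(number_of_shards, number_of_routing_shards):
    """List the shard counts an index could be split to (illustrative helper).

    Assumed rules: the target must be a multiple of the current shard
    count, and number_of_routing_shards must be a multiple of the target.
    """
    if number_of_routing_shards % number_of_shards != 0:
        raise ValueError(
            "number_of_routing_shards must be a multiple of number_of_shards"
        )
    # Walk the multiples of the current shard count, starting at 2x,
    # and keep those that divide the routing shard count evenly.
    return [
        target
        for target in range(
            2 * number_of_shards, number_of_routing_shards + 1, number_of_shards
        )
        if number_of_routing_shards % target == 0
    ]


print(split_targets(5, 30))  # [10, 15, 30]
print(split_targets(1, 30))  # [2, 3, 5, 6, 10, 15, 30]
```

So with the template reduced to 1 shard and routing shards left at 30, every divisor of 30 greater than 1 would remain available as a split target.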
Apologies for piling on to this thread, but I'm also trying to learn about number_of_routing_shards. Is there any reason not to create an index with this setting? If there's no performance/storage penalty, it seems like one always ought to use it, just to have the option of splitting down the road.
From what I've read since, the requirement to set it is being dropped in 7.0 anyway (presumably a default will be applied so that indices can be split in the future without you having to define it manually). With that in mind, I just removed it from my template. If you follow the link in Steffen's post, you'll see the note at the top of the article about the requirement being dropped in the future.