Limit for shard size?

klahnakoski · January 25, 2016, 2:49pm

My ES cluster has grown over the past few months. My drives are now 90% full, so I am adding more. While I am on the topic of managing my ES cluster, I was wondering about maximum shard sizes. The biggest index is reporting 6 billion documents and 4 TB across 24 shards, with each shard approaching 200 GB.

The 200 GB shard size makes moving a shard a slow process. When I add a node, it can spend over an hour copying (cluster.routing.allocation.cluster_concurrent_rebalance: 1) before it can take over any of the query load. I believe smaller shards will replicate faster, allowing the new node to participate sooner. Am I right about this?

How small should the shards be? 20 GB shards will copy over in 1/10 the time (about 6 min), but that would mean 240 shards! That sounds excessive! What are the memory requirements for a node (node.master=true, node.data=false) that must aggregate query results from so many shards?
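The arithmetic above can be sketched roughly as follows. This assumes copy time scales linearly with shard size and takes "over an hour" to mean about 60 minutes for a 200 GB shard. The function name `resharding_estimate` is made up for illustration, and real recovery time also depends on network and disk throughput.

```python
import math

def resharding_estimate(current_shards, current_shard_gb, current_copy_min, target_shard_gb):
    """Estimate the shard count and per-shard copy time for a new target shard size.

    Assumes total data stays the same and copy time is proportional to shard size.
    """
    total_gb = current_shards * current_shard_gb
    new_shards = math.ceil(total_gb / target_shard_gb)
    copy_min = current_copy_min * target_shard_gb / current_shard_gb
    return new_shards, copy_min

# Figures from the post: 24 shards of ~200 GB, ~60 min (assumed) to copy one shard.
for target_gb in (200, 50, 20):
    shards, minutes = resharding_estimate(24, 200, 60, target_gb)
    print(f"{target_gb:>3} GB shards: {shards:>3} shards, ~{minutes:.0f} min per shard copy")
```

The trade-off is the one the question describes: each step down in shard size shortens the per-shard copy but multiplies the number of shards the cluster has to track and query.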

On the subject of hard limits: I read that a Lucene index has a 2 billion document limit. Good, I am far from hitting that. I also read that ES version 2.0+ will put each shard on a single data path. Does that mean each of my data paths must be larger than a 200 GB shard?
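Since each Elasticsearch shard is its own Lucene index, the document limit applies per shard rather than to the whole index. Here is a quick check of the headroom, assuming the documents are spread evenly over the 24 shards. The exact constant below (`Integer.MAX_VALUE - 128`) is my understanding of Lucene's limit and should be treated as an assumption; the post only says "2 billion".

```python
# Assumed exact Lucene per-index limit (Integer.MAX_VALUE - 128); the post says "2 billion".
LUCENE_MAX_DOCS = 2_147_483_519

def docs_per_shard(total_docs, primary_shards):
    """Average documents per shard, assuming an even spread across shards."""
    return total_docs / primary_shards

per_shard = docs_per_shard(6_000_000_000, 24)  # figures from the post
headroom = LUCENE_MAX_DOCS / per_shard

print(f"~{per_shard:,.0f} docs per shard, about {headroom:.1f}x below the limit")
```

Note that nested documents, if the mapping uses them, count as separate Lucene documents, so the real per-shard count can be higher than the figure Elasticsearch reports.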

Thanks

warkolm · January 26, 2016, 1:52am

That's larger than we recommend. We suggest 50GB simply because moving more than that around due to (re)allocation takes a looooong time.
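As a rough sketch of what the 50 GB suggestion would mean for the index in the question: the snippet below sizes a new index from the poster's figures (24 shards of about 200 GB) and builds a create-index request body. `index.number_of_shards` and `index.number_of_replicas` are real index settings, but the helper name `index_settings` and the replica count of 1 are assumptions for illustration. The primary shard count is fixed when an index is created, so this would apply to a new index that the data is reindexed into.

```python
import json
import math

def index_settings(total_gb, target_shard_gb, replicas=1):
    """Build a create-index body with enough primary shards to stay near the target size.

    replicas=1 is an assumed default for illustration, not taken from the thread.
    """
    shards = math.ceil(total_gb / target_shard_gb)
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }

# 24 shards x ~200 GB from the question, targeting ~50 GB per shard.
body = index_settings(24 * 200, 50)
print(json.dumps(body, indent=2))
```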
