Correct number of shards for 5.3 TB indices

mario12 · April 19, 2017, 5:47pm

We have indices with a total size of 5.3 TB. We have one replica for these indices, so 10.6 TB in total. We are planning to start a re-index process to make the shards smaller (currently 1 TB per shard). We are planning to create new indices with 200 shards, which will make the shard size approximately 27 GB. We have a 33-node cluster, and a few small indices share this cluster. Please let me know if there are any disadvantages to having 200 shards.
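As a rough check of the shard-size arithmetic in this post, here is a minimal Python sketch. It assumes 1 TB = 1024 GB and that data is spread evenly across primary shards; the function name is only illustrative.

```python
# Assumption: binary units (1 TB = 1024 GB) and evenly distributed data.
GB_PER_TB = 1024


def shard_size_gb(primary_data_tb, primary_shards):
    """Approximate size of one primary shard in GB."""
    return primary_data_tb * GB_PER_TB / primary_shards


size = shard_size_gb(5.3, 200)
print(f"{size:.1f} GB per primary shard")  # prints "27.1 GB per primary shard"
```

Replicas do not change the per-shard size; they only double the number of shard copies the cluster has to hold.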

warkolm · April 19, 2017, 11:58pm

That's a reasonable size. Why is the index so large though?

mario12 · April 20, 2017, 12:27am

Because these indices are not time partitioned. We are planning to implement that in the future.

I read that there is overhead in having more shards. Will having 200 primary + 200 replica shards across a total of 33 nodes affect performance negatively?

We are on Java 1.7 and Elasticsearch 1.7.2.

warkolm · April 20, 2017, 12:31am

You'd have to test it to be sure. But given it's only 12 shards per node, I wouldn't imagine a big problem.

mario12 · April 20, 2017, 12:35am

I will try to get this tested. What if we have 60 shards (p) + 60 shards (r)? This will reduce the number of shards per node to about 4. Is there any advantage to this proposal over the 200 + 200 we discussed earlier?
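To compare the two layouts discussed in this thread, a small Python sketch of the shards-per-node arithmetic. It assumes shard copies are balanced evenly across the 33 nodes and ignores the small indices that share the cluster; the function name is hypothetical.

```python
def shards_per_node(primary_shards, replicas, nodes):
    """Average shard copies (primaries plus replicas) per node."""
    total_copies = primary_shards * (1 + replicas)
    return total_copies / nodes


# One replica and 33 nodes, as described in the thread.
for primaries in (200, 60):
    per_node = shards_per_node(primaries, replicas=1, nodes=33)
    print(f"{primaries} primaries: {per_node:.1f} shards per node")
# prints:
# 200 primaries: 12.1 shards per node
# 60 primaries: 3.6 shards per node
```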

warkolm · April 20, 2017, 12:43am

Generally, more shards = higher indexing speeds, fewer shards = faster query speeds.

But it's all stuff you need to test as more/less shards is relative.

mario12 · April 20, 2017, 12:46am

ok. Thanks for your help on this issue.

mario12 · April 20, 2017, 12:48am

Is there a quick way to find out if an index is bound by reads (queries) or writes (indexing)?

warkolm · April 20, 2017, 1:18am

Use some kind of monitoring.
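One possible starting point, not something suggested in the thread itself: the index stats API (`GET /<index>/_stats`) reports cumulative indexing and search times. The sketch below assumes two such responses taken some time apart and compares how much time went to each. The sample numbers are invented for illustration, and the result is only a rough indication, not proof of a bottleneck.

```python
def time_spent(stats_before, stats_after):
    """Return (indexing_ms, search_ms) accumulated between two _stats snapshots."""
    def totals(stats):
        total = stats["_all"]["total"]
        indexing_ms = total["indexing"]["index_time_in_millis"]
        search_ms = (total["search"]["query_time_in_millis"]
                     + total["search"]["fetch_time_in_millis"])
        return indexing_ms, search_ms

    before = totals(stats_before)
    after = totals(stats_after)
    return after[0] - before[0], after[1] - before[1]


def snapshot(index_ms, query_ms, fetch_ms):
    """Build a trimmed-down stats document (hypothetical sample data)."""
    return {"_all": {"total": {
        "indexing": {"index_time_in_millis": index_ms},
        "search": {"query_time_in_millis": query_ms,
                   "fetch_time_in_millis": fetch_ms},
    }}}


before = snapshot(1000, 500, 100)
after = snapshot(4000, 900, 200)
indexing_ms, search_ms = time_spent(before, after)
print(f"indexing: {indexing_ms} ms, search: {search_ms} ms")
# prints "indexing: 3000 ms, search: 500 ms"
```

In a real cluster the two snapshots would come from the stats endpoint rather than the `snapshot` helper, and node-level metrics such as CPU and disk I/O would still be needed to confirm where the time is actually going.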

mario12 · April 20, 2017, 1:51am

Thanks again.

system · May 18, 2017, 1:53am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
