The shards are crowded in a node

mcun0s · December 19, 2017, 2:36am

Hi

We're using Elasticsearch for almost all of our products, and we found a strange issue on our Elasticsearch cluster.
The problem is that, for a couple of indexes, the shards are crowded on one node.

Issue Image

Please check the above image.

Elasticsearch Information

Elasticsearch version : 5.5
OS version : CentOS 6.9
3 Master Nodes
7 Hot Data Nodes
- Shards configuration : 24
- Replica configuration : 0

Changes

We had been using 5 data nodes, and recently we added 2 data nodes to the cluster.
Since then, the shards have been crowded on the 6th node.

Reproduce Issue

I can't say how to reproduce this issue, because the other indexes look fine.
Obviously, the issue could occur again when we create new indexes.

Workaround

I believe we can resolve this issue by relocating the shards manually.
But I'm worried that it will occur again after doing that.
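
For reference, a manual relocation goes through the cluster reroute API (`POST /_cluster/reroute`) with a `move` command. Below is a minimal sketch that only builds the request body. The index name, shard number, and node names are hypothetical placeholders, not values taken from this cluster.

```python
import json


def build_move_command(index, shard, from_node, to_node):
    """Build a request body for POST /_cluster/reroute that moves one shard."""
    return {
        "commands": [
            {
                "move": {
                    "index": index,
                    "shard": shard,
                    "from_node": from_node,
                    "to_node": to_node,
                }
            }
        ]
    }


# Hypothetical values: replace with a real index, shard number, and node names.
body = build_move_command("my-index-2017.12.19", 0, "hot-6", "hot-1")
print(json.dumps(body, indent=2))
```

Note that the balancer may move shards again afterwards, so a manual move alone may not be a lasting fix.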

Questions

So, my questions are,

Is this expected behavior when additional data nodes are added to an existing cluster?
Does Elasticsearch still perform well if the shards are crowded on a single node?
If this is a real problem, how do we fix it?

Thanks for your help!
Best regards,
Unho.

dadoonet · December 19, 2017, 6:03am

Is the available disk space lower on node 1 than node 2?

zqc0512 · December 19, 2017, 6:25am

Check the disk usage, the rebalancing, and the number of shards per node.

warkolm · December 19, 2017, 7:46am

Why do you have so many primary shards?

mcun0s · December 19, 2017, 8:51am

Hi David, thanks for your help.

The crowded node is node6, and the disk size of node6 is lower than the others.
You can see the shard information below.

node  shards disk.indices disk.percent
hot-1   1487        581gb           76
hot-2   1487      548.1gb           77
hot-3   1487      557.2gb           43
hot-4   1487      557.8gb           78
hot-5   1487      554.9gb           80
hot-6   1416      462.9gb           89
hot-7   1487      571.5gb           76

mcun0s · December 19, 2017, 8:52am

The index size is too large, so we decided to split the data across shards.

mcun0s · December 19, 2017, 8:59am

Could you look at the information below?
I'm not sure what you mean by "the disk /rebalance".

node  shards disk.indices disk.percent
hot-1   1487        581gb           76
hot-2   1487      548.1gb           77
hot-3   1487      557.2gb           43
hot-4   1487      557.8gb           78
hot-5   1487      554.9gb           80
hot-6   1416      462.9gb           89
hot-7   1487      571.5gb           76

dadoonet · December 19, 2017, 9:04am

Is it probably caused by disk-based shard allocation (Disk-based shard allocation | Elasticsearch Guide [8.11] | Elastic)?

cluster.routing.allocation.disk.watermark.low
Controls the low watermark for disk usage. It defaults to 85%, meaning ES will not allocate new shards to nodes once they have more than 85% disk used. It can also be set to an absolute byte value (like 500mb) to prevent ES from allocating shards if less than the configured amount of space is available.
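
As a quick illustration of the rule quoted above, here is a minimal sketch that applies the 85% default to the `disk.percent` values posted earlier in this thread. The helper names are my own, and the check only mirrors the quoted description; it is not how Elasticsearch implements the allocation decider.

```python
def parse_percent(value):
    """Turn a watermark such as '85%' into the number 85.0."""
    return float(value.rstrip("%"))


def nodes_over_low_watermark(disk_percent_by_node, low_watermark="85%"):
    """Return the nodes whose disk usage exceeds the low watermark."""
    limit = parse_percent(low_watermark)
    return sorted(
        node for node, used in disk_percent_by_node.items() if used > limit
    )


# disk.percent values copied from the allocation table in this thread.
disk_percent = {
    "hot-1": 76, "hot-2": 77, "hot-3": 43, "hot-4": 78,
    "hot-5": 80, "hot-6": 89, "hot-7": 76,
}

blocked = nodes_over_low_watermark(disk_percent)
print(blocked)  # ['hot-6']

# Body for PUT /_cluster/settings if the watermark needs to be changed.
# The value shown is just the default repeated, not a recommendation.
settings_body = {
    "transient": {"cluster.routing.allocation.disk.watermark.low": "85%"}
}
```

With these numbers only hot-6 (89%) is above the default low watermark.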

warkolm · December 19, 2017, 9:08am

You have too many shards; you could easily double the size of the current shards.
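
To put a number on that, here is a rough sketch that divides `disk.indices` by the shard count for each node, using the figures posted earlier in this thread. It assumes `disk.indices` is spread evenly over the shards on a node, which is only an approximation.

```python
# (shards, disk.indices in gb) per node, copied from the table in this thread.
allocation = {
    "hot-1": (1487, 581.0),
    "hot-2": (1487, 548.1),
    "hot-3": (1487, 557.2),
    "hot-4": (1487, 557.8),
    "hot-5": (1487, 554.9),
    "hot-6": (1416, 462.9),
    "hot-7": (1487, 571.5),
}


def average_shard_size_gb(shards, disk_indices_gb):
    """Average size of one shard on a node, in gb."""
    return disk_indices_gb / shards


for node, (shards, size_gb) in allocation.items():
    print(f"{node}: {average_shard_size_gb(shards, size_gb):.2f} gb per shard")

total_shards = sum(shards for shards, _ in allocation.values())
total_gb = sum(size_gb for _, size_gb in allocation.values())
cluster_average = total_gb / total_shards
print(f"cluster: {cluster_average:.2f} gb per shard over {total_shards} shards")
```

Every node comes out well under half a gb per shard on average, which is why the shard count looks high for this amount of data.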

zqc0512 · December 20, 2017, 12:33am

Is the disk space not the same on every node?

mcun0s · December 21, 2017, 2:03am

Hm... If your theory is correct, then the 6th node should not be allocated new shards, but it still is.
Still, this is useful information for me. I didn't know about this option, so it should be useful to us in the future.
Thanks a lot!

mcun0s · December 21, 2017, 2:09am

Yes, as you say, we have too many shards.
But disk usage is increasing gradually as new documents are stored.

We cannot predict how much data will be stored.
Do you have any strategy for that?

I think we could split each shard by data size, but ES doesn't support that.
So we decided that the number of shards should be between 24 and 40.

Do you have any recommendation?
Best regards,
Unho.

zqc0512 · December 21, 2017, 2:58am

As far as I know, it is supported. See the Elasticsearch docs.

warkolm · December 21, 2017, 3:25am

It does in 6.1 - Split Index | Elasticsearch Reference [6.1] | Elastic
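
A minimal sketch of what a split request could look like, assuming the `_split` endpoint described on that page. The index names are hypothetical. As I read the 6.1 reference, the source index also has to be write-blocked and created with `index.number_of_routing_shards`, so please check the linked page before relying on this.

```python
def build_split_request(source_index, target_index, source_shards, target_shards):
    """Return (path, body) for a POST request that splits an index.

    The target shard count is checked to be a larger multiple of the
    source shard count; other preconditions are not checked here.
    """
    if target_shards <= source_shards or target_shards % source_shards != 0:
        raise ValueError(
            "target_shards must be a larger multiple of source_shards"
        )
    path = f"/{source_index}/_split/{target_index}"
    body = {"settings": {"index.number_of_shards": target_shards}}
    return path, body


# Hypothetical index names; 24 is the shard count mentioned in this thread.
path, body = build_split_request("my-index", "my-index-split", 24, 48)
print("POST", path)
print(body)
```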

mcun0s · December 21, 2017, 7:48am

I think the docs you mentioned are the ones Mark shared, right?
I will review them and then share them with my team. Thanks for all your help!

Best regards,
Unho.

mcun0s · December 21, 2017, 7:52am

Your information is so useful. Thank you so much, Mark!

It seems our ES has to be upgraded to version 6.1. (We're using ES version 5.5.)
We need to make an upgrade plan so that we can use it.

I will share this document with my team!
Have a great day!

Best regards,
Unho.

dadoonet · December 21, 2017, 8:18am

Adding also this presentation in case it helps:

zqc0512 · December 21, 2017, 8:20am

With 5.5, as far as I know, you can rebuild the index with a new mapping, which can also solve it.
Upgrading to 6.1 needs a lot of testing.
https://www.elastic.co/guide/en/elasticsearch/reference/5.5/indices-shrink-index.html
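
The linked page covers the shrink index API, where the target shard count has to be a factor of the source shard count. A small sketch, assuming the 24 primary shards mentioned at the top of this thread, lists the counts an index could be shrunk to:

```python
def valid_shrink_targets(source_shards):
    """Shard counts smaller than source_shards that divide it evenly."""
    return [n for n in range(1, source_shards) if source_shards % n == 0]


print(valid_shrink_targets(24))  # [1, 2, 3, 4, 6, 8, 12]
```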

mcun0s · December 22, 2017, 12:57am

Actually, we manage the indices by date. Even so, we ended up oversharding.
Thanks for sharing the document.

I've learned good techniques from you all, and I believe they will be helpful to my team and me.
Thanks again.

mcun0s · December 22, 2017, 12:58am

Yes, as you advised, we are going to upgrade our Elasticsearch carefully. Thanks for your advice!