Shards are crowded onto a single node


(Unho Yeo) #1

Hi :)

We use Elasticsearch for almost all of our products, and we found a strange issue on our cluster.
The problem is that, for a couple of indices, the shards are crowded onto a single node.

[Image: Issue Image]

Please check the above image.

Elasticsearch Information

  • Elasticsearch version : 5.5
  • OS version : CentOS 6.9
  • 3 Master Nodes
  • 7 Hot Data Nodes
    • Shards configuration : 24
    • Replica configuration : 0

Changes

We had been running 5 data nodes and recently added 2 more data nodes to the cluster.
Since then, shards have been crowding onto the 6th node.

Reproduce Issue

I can't say how to reproduce this issue, because the other indices look fine.
Presumably it can occur again whenever we create new indices.

Workaround

I believe we can resolve this issue by relocating the shards manually.
But I'm worried that it will happen again afterwards.
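
For reference, a manual relocation goes through the cluster reroute API (`POST /_cluster/reroute`) with a `move` command. The following is a minimal sketch that only builds the request body and does not contact a cluster. The index name and shard number are hypothetical placeholders, and the node names follow the hot-N naming used elsewhere in this thread.

```python
import json


def build_move_command(index, shard, from_node, to_node):
    """Build the JSON body for POST /_cluster/reroute that moves one shard."""
    return {
        "commands": [
            {
                "move": {
                    "index": index,
                    "shard": shard,
                    "from_node": from_node,
                    "to_node": to_node,
                }
            }
        ]
    }


# Hypothetical index name and shard number, for illustration only.
body = build_move_command("my-index-2018.01.01", 0, "hot-6", "hot-3")
print(json.dumps(body, indent=2))
```

A manual move is not guaranteed to stick: the allocator may relocate shards again later if it decides the cluster is unbalanced.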

Questions

So, my questions are,

  1. Is this expected behavior when additional data nodes are added to an existing cluster?
  2. Does Elasticsearch still perform well if the shards are crowded onto a single node?
  3. If this is a real problem, how do we fix it?

Thanks for your help!
Best regards,
Unho.


(David Pilato) #2

Is the available disk space lower on node 1 than node 2?


(andy_zhou) #3

See the disk /rebalance /shards per node.


(Mark Walkom) #4

Why do you have so many primary shards?


(Unho Yeo) #5

Hi David, thanks for your help.

The crowded node is node 6, and the disk size of node 6 is smaller than that of the others.
You can see the shard information below.

node  shards disk.indices disk.percent
hot-1   1487        581gb           76
hot-2   1487      548.1gb           77
hot-3   1487      557.2gb           43
hot-4   1487      557.8gb           78
hot-5   1487      554.9gb           80
hot-6   1416      462.9gb           89
hot-7   1487      571.5gb           76

(Unho Yeo) #6

The indices are very large, so we decided to split the data across many shards.


(Unho Yeo) #7

Could you take a look at the information below?
I'm not sure what you mean by "the disk /rebalance".

node  shards disk.indices disk.percent
hot-1   1487        581gb           76
hot-2   1487      548.1gb           77
hot-3   1487      557.2gb           43
hot-4   1487      557.8gb           78
hot-5   1487      554.9gb           80
hot-6   1416      462.9gb           89
hot-7   1487      571.5gb           76


(David Pilato) #8

Could it be caused by the disk-based shard allocator? See https://www.elastic.co/guide/en/elasticsearch/reference/current/disk-allocator.html

cluster.routing.allocation.disk.watermark.low
Controls the low watermark for disk usage. It defaults to 85%, meaning ES will not allocate new shards to nodes once they have more than 85% disk used. It can also be set to an absolute byte value (like 500mb) to prevent ES from allocating shards if less than the configured amount of space is available.
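
To make the check concrete, here is a small sketch that compares the `disk.percent` column posted earlier in this thread against the 85% default quoted above. The figures are copied from that table; the function name is an illustrative choice, not part of any Elasticsearch client.

```python
LOW_WATERMARK_PERCENT = 85  # default low watermark quoted above

# node, shards, disk.indices, disk.percent (copied from the table in this thread)
ALLOCATION = """\
hot-1 1487 581gb 76
hot-2 1487 548.1gb 77
hot-3 1487 557.2gb 43
hot-4 1487 557.8gb 78
hot-5 1487 554.9gb 80
hot-6 1416 462.9gb 89
hot-7 1487 571.5gb 76
"""


def nodes_over_watermark(table, threshold):
    """Return the names of nodes whose disk usage exceeds the threshold."""
    over = []
    for line in table.splitlines():
        node, _shards, _disk_indices, disk_percent = line.split()
        if int(disk_percent) > threshold:
            over.append(node)
    return over


print(nodes_over_watermark(ALLOCATION, LOW_WATERMARK_PERCENT))  # ['hot-6']
```

With these figures, only hot-6 (89%) is above the default threshold.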


(Mark Walkom) #9

You have too many shards; you could easily double the size of the current shards.
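
To put a rough number on this, the average shard size can be estimated from the allocation table posted earlier in this thread. This is a back-of-the-envelope sketch: it assumes 1 gb = 1024 mb and averages over every shard on the hot nodes, regardless of which index it belongs to.

```python
# node -> (shards, disk.indices in GB), copied from the table in this thread
NODES = {
    "hot-1": (1487, 581.0),
    "hot-2": (1487, 548.1),
    "hot-3": (1487, 557.2),
    "hot-4": (1487, 557.8),
    "hot-5": (1487, 554.9),
    "hot-6": (1416, 462.9),
    "hot-7": (1487, 571.5),
}

total_shards = sum(shards for shards, _ in NODES.values())
total_gb = sum(gb for _, gb in NODES.values())
avg_shard_mb = total_gb * 1024 / total_shards

print(f"{total_shards} shards, {total_gb:.1f} GB in total")
print(f"average shard size: about {avg_shard_mb:.0f} MB")
```

On these figures the average works out to a few hundred megabytes per shard.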


(andy_zhou) #10

Is the disk space not the same on every node?


(Unho Yeo) #11

Hm... If your theory is correct, then the 6th node should not be allocated any new shards, but it is.
Still, this is useful information for me. I didn't know about this option, so it could be useful for us in the future.
Thanks a lot!


(Unho Yeo) #12

Yeah, as you say, we have too many shards.
But disk usage is increasing gradually as new documents are stored.

We cannot predict how much data will be stored.
Do you have any strategy for that?

I think we could split each shard by data size, but ES doesn't support that.
So we decided that the number of shards should be 24 ~ 40.

Do you have any recommendations?
Best regards,
Unho.


(andy_zhou) #13

As far as I know, it is supported. See the Elasticsearch docs.


(Mark Walkom) #14

It does in 6.1 - https://www.elastic.co/guide/en/elasticsearch/reference/6.1/indices-split-index.html
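
A rough sketch of how a split request could be prepared, based on a reading of the 6.1 docs linked above. The assumed rules are that the target shard count must be a larger multiple of the source shard count and must evenly divide `index.number_of_routing_shards`, a setting that has to be chosen when the source index is created. The source index also has to be made read-only before the split. The routing-shards value of 96 is hypothetical, and the helper names are illustrative only.

```python
def valid_split_targets(source_shards, routing_shards):
    """Return the target shard counts a split could produce.

    Assumed rule: the target is a larger multiple of the source shard
    count and divides index.number_of_routing_shards evenly.
    """
    return [
        target
        for target in range(2 * source_shards, routing_shards + 1, source_shards)
        if routing_shards % target == 0
    ]


def build_split_body(target_shards):
    """Build the JSON body for POST /<source>/_split/<target>."""
    return {"settings": {"index.number_of_shards": target_shards}}


# 24 is the shard count mentioned in this thread; 96 is a hypothetical value.
targets = valid_split_targets(24, 96)
print(targets)  # [48, 96]
print(build_split_body(targets[0]))
```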


(Unho Yeo) #15

I think the docs you mentioned are the ones Mark shared, right?
I will review them and then share them with my team. Thanks for all of your help!

Best regards,
Unho.


(Unho Yeo) #16

Your information is so useful. Thank you so much, Mark!

It seems our ES has to be upgraded to version 6.1. (We're using ES 5.5.)
We need to make an upgrade plan so we can use that.

I will share this document with my team!
Have a great day!

Best regards,
Unho.


(David Pilato) #17

Adding also this presentation in case it helps:


(andy_zhou) #18

As far as I know, on 5.5 you can also solve it by rebuilding the index with its mapping.
Upgrading to 6.1 needs a lot of testing.
https://www.elastic.co/guide/en/elasticsearch/reference/5.5/indices-shrink-index.html
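
The shrink API linked above reduces the number of primary shards of an existing index. As the 5.5 docs describe it, the target shard count must be a factor of the source shard count, and the source index must first be made read-only with a copy of every shard on one node. The sketch below lists the possible targets for a 24-shard index and builds the request body; the helper names are illustrative only.

```python
def valid_shrink_targets(source_shards):
    """Return the smaller shard counts that evenly divide the source count."""
    return [n for n in range(1, source_shards) if source_shards % n == 0]


def build_shrink_body(target_shards):
    """Build the JSON body for POST /<source>/_shrink/<target>."""
    return {"settings": {"index.number_of_shards": target_shards}}


# 24 is the shard count mentioned in this thread.
shrink_targets = valid_shrink_targets(24)
print(shrink_targets)  # [1, 2, 3, 4, 6, 8, 12]
print(build_shrink_body(12))
```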


(Unho Yeo) #19

Actually, we manage the indices by date. Even so, we ended up oversharding.
Thanks for sharing the document.

I've learned good techniques from you guys, and I believe they will be useful for my team and me.
Thanks again.


(Unho Yeo) #20

Yes, as you told me, we're going to upgrade our Elasticsearch carefully. Thanks for your advice!