Sharding hot vs warm Nodes

Hi, I hope you are all doing well. I would like to see if someone can help me with this.

I have a cluster of 35 nodes running version 7.4.2
3 x master nodes
2 x coordinators
8 x hot
2 x cold
20 x warm

We get daily indices that are sometimes in the TB range and sometimes above 20GB, and I would like to ask:

  • How can I calculate the number of shards that would work for these volumes?
  • What factors should be taken into consideration when calculating this? Does the number of hot nodes matter versus the number of warm nodes, or is it the total number of data nodes?
  • Is there a way to set a limit on shard size, and if so, what happens when it is exceeded?
  • When you do a shrink operation, let's say during a lifecycle rotation when we move indices from hot to warm, how can I calculate the value to shrink to? Does it matter if, say, I have 50 shards and shrink to 1, or would it be better not to shrink and just move them with a one-to-one mapping of the shards, i.e. if the index has 8 shards then move it to warm with 8 shards?

Currently I am creating these indices, which average about 1TB in size, with 24 shards. But I am not sure if that is too little or too much. I got that number by assuming 1 primary shard per hot node + 1 replica per shard = 24 shards, with a primary shard size of 83GB each.

Also, not sure if it is worth mentioning that the number of data disks differs between hot and warm:
hot has about 12 and warm has about 8 data disks, all of them configured as JBOD.

Thank you

Nice summary. Here is a related discussion with several useful resources & guides. I suggest you start there.

Can you explain the "daily indices that are on the TB sizes sometimes and sometimes in the ranges above 20GB" - is that daily data of TB or what is the 20GB part?

More broadly, the guidance seems to be to hold shards to about 50GB, more or less, so your 83GB is on the large side and you probably should aim for less, like half.

Given your sizes, it seems this shard size limit will be your governing factor as long as your ingest & search speeds are acceptable. And you have enough hot/warm nodes that hopefully (not guaranteed) the shards will be spread around, improving performance. Also, simpler is better.

As far as I know there is no way to 'limit' the size - very large shards seem to perform poorly (single-threaded queries) and recover slowly. However, you can use ILM to rollover the index when it hits a given size, so if your index has 20 shards and reaches 1TB in size, you're at 50GB per shard and can get a new index, etc.
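A quick sketch of that rollover arithmetic (the 50GB-per-shard target and the 20-shard count are just the example figures above, not fixed Elasticsearch limits):

```python
# Index size at which to roll over so each primary shard stays near the target.
# TARGET_SHARD_GB is the rough guideline discussed above, not a hard limit.
TARGET_SHARD_GB = 50

def rollover_size_gb(primary_shards: int, target_shard_gb: int = TARGET_SHARD_GB) -> int:
    """Index size (GB) at which each primary shard reaches the target size."""
    return primary_shards * target_shard_gb

print(rollover_size_gb(20))  # 20 primaries * 50GB -> roll over around 1000GB (1TB)
```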

Also watch your overall shard count, i.e. about 20 per GB of heap (maybe a little more in recent versions). People tend to miss this and end up with a thousand shards and stability problems. This & index count often drive people away from daily indexes.
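To put that rule of thumb into numbers (the heap figure is an assumption for illustration; Elasticsearch heap is normally capped around 30GB, well below total server RAM):

```python
# Rough per-node shard budget from the ~20-shards-per-GB-of-heap guideline above.
SHARDS_PER_GB_HEAP = 20

def max_shards_per_node(heap_gb: float) -> int:
    """Approximate shard budget for one node given its JVM heap size."""
    return int(heap_gb * SHARDS_PER_GB_HEAP)

print(max_shards_per_node(30))  # a 30GB heap -> roughly 600 shards per node
```

Note this counts against the JVM heap, not the server's total RAM.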

For shrinking, especially for cold storage, I'd think you'd shrink down based on the 50GB target, so if you have a 250GB index with 10 shards, you could shrink to 5 shards for warm/cold storage.
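One detail worth noting: the shrink API requires the target primary count to be a factor of the source primary count. A sketch of picking a target from the 50GB guideline (the helper name and fallback logic are my own, not an Elasticsearch API):

```python
# Pick a shrink target: the factor of the source shard count closest to
# the count implied by the target shard size.
def shrink_target(source_shards: int, index_size_gb: float,
                  target_shard_gb: float = 50) -> int:
    ideal = max(1, round(index_size_gb / target_shard_gb))
    # shrink requires the target to be a factor of the source shard count
    factors = [n for n in range(1, source_shards + 1) if source_shards % n == 0]
    return min(factors, key=lambda n: abs(n - ideal))

print(shrink_target(10, 250))  # 250GB / 50GB = 5, and 5 divides 10 -> 5
```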

Remember the ILM shrink operation moves all shards to a single node (a sad requirement of the shrink operation); for your 1TB+ indexes, this is a lot of data & movement, so watch cluster performance when this is going on.

Ideally you create shards at the ideal target size for your use case. The number of shards per node then depends on how much data you can store on them and still meet query latency SLAs and indexing requirements. This will depend on requirements and query patterns as well as hardware, especially the performance of the storage.

Hot nodes determine how much you can ingest per day, while the number of warm and cold nodes determines how long you can store this indexed data. The relative size of the tiers therefore generally depends on the ingestion rate compared to the total storage volume.
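A back-of-the-envelope version of that relationship: days of retention a tier can hold given daily ingest. All figures here (2TB/day, the 75% fill factor) are example assumptions, not values from this cluster:

```python
# Rough retention estimate for a storage tier. fill_factor leaves headroom
# so disks are not filled to capacity (the 0.75 value is an assumption).
def retention_days(tier_capacity_tb: float, daily_ingest_tb: float,
                   fill_factor: float = 0.75) -> float:
    """Approximate days of data a tier holds at a given daily ingest rate."""
    return tier_capacity_tb * fill_factor / daily_ingest_tb

# e.g. a 288TB warm tier at 2TB/day (count primaries + replicas in the ingest)
print(round(retention_days(288, 2)))  # 288 * 0.75 / 2 = 108 days
```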

This is what rollover is used for as it allows you to change to a new index when the ideal size is reached.

Shrinking can be useful in certain scenarios, but the cluster needs to do less work if you simply generate indices at the shard size you want long term. Shard size will affect query performance, the time it takes to forcemerge, and how long it takes to relocate shards, so you will need to test what is optimal for you.

83GB may be fine for your cluster and use case but, as mentioned, may be a bit on the large side. The 50GB size is however just a general recommendation, and the ideal size will depend on your data, queries and use case.

How much disk space do the nodes in the different tiers have? What type of disk are you using for the different tiers?


Thank you for your answers.
@Steve_Mushero

Can you explain the "daily indices that are on the TB sizes sometimes and sometimes in the ranges above 20GB" - is that daily data of TB or what is the 20GB part?

Sorry, I should have been clearer on that. The index ranges from 20GB up to 1.2TB. This is daily.

Also watch your overall shard count, i.e. about 20 per GB of heap (maybe a little more in recent versions). People tend to miss this and end up with a thousand shards and stability problems. This & index count often drive people away from daily indexes.

Are you referring to the configured heap, or the total configured RAM per server?
In this case each of our servers has about 380GB of RAM.

@Christian_Dahlqvist

How much disk space do the nodes in the different tiers have? What type of disk are you using for the different tiers?

Both hot and warm have the same size of disk, 1.8TB SSD drives, so:
hot: 12 x 1.8TB = 21.6TB per server x 8 = ~173TB
warm: 8 x 1.8TB = 14.4TB per server x 20 = ~288TB
cold: 12 x 3.7TB = 44.4TB per server x 2 = ~88.8TB

Are hot and warm nodes using different types of storage, e.g. SSD vs HDD?

Both hot and warm use SSDs.

Hi, one more thing.

So let's say I have a 2TB index: 2000GB / 50GB = 40 shards per index.

Do these 40 shards refer to primaries only, or does that take replicas into consideration as well?

So would 20 primary shards with one replica give 40 shards in total? Or should it be 80 shards, meaning 40 primaries plus one replica each?

Thanks

2TB index = 2000GB / 50GB = 40 shards per index.
Do these 40 shards refer to primaries only, or does that take replicas into consideration as well?

2TB index = 2000GB / 50GB = 40 primary shards per index.
So with one replica, you will have 80 shards for this index - i.e. 2 copies of all the data.
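The replica arithmetic spelled out, in case it helps: total shards = primaries × (1 + replicas).

```python
# Total shard count for an index given its primary count and replica setting.
def total_shards(primaries: int, replicas: int = 1) -> int:
    return primaries * (1 + replicas)

print(total_shards(40, 1))  # 40 primaries + 40 replica copies = 80 shards
```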