What is the recommended memory:data ratio for a cold zone?

I see for the hot zone it's 30.
For a warm zone it's 160.
I haven't seen a value for cold zone.

And how exactly is that calculated?

Thank you.

Hi @calin

First I would ask how you intend to use the Cold Tier?

Do you have a commercial license or Basic / Free License?

A commercial Enterprise license supports Searchable Snapshots, which is "the magic" behind the cold tier in Elastic Cloud.

Cold on a Basic License is really just Warm with no Replicas... which can be good for lower cost / storage, but it carries a risk of downtime or lost data.

But in short, in Elastic Cloud the Warm and Cold node profiles are the same: 160:1.

Where does this come from? It comes from running 1000s of clusters across 100s of use cases, i.e. experience.

That said, that does not mean that ratio is perfect for you but it should be a very good starting point...
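
To make the ratios concrete, here is a minimal sketch of how they translate into disk per node. It assumes the ratio means GB of disk per 1 GB of RAM (so 160:1 is 160 GB of disk for each GB of RAM); the `RATIOS` table and the `disk_capacity_gb` helper are hypothetical names used only for illustration.

```python
# Assumption: a ratio of N means N GB of disk per 1 GB of RAM.
# Values are the ones quoted in this thread: hot 30, warm 160, cold same as warm.
RATIOS = {"hot": 30, "warm": 160, "cold": 160}


def disk_capacity_gb(ram_gb, tier):
    """Disk (GB) a node of the given tier would carry at the quoted ratio."""
    return ram_gb * RATIOS[tier]


if __name__ == "__main__":
    # Example: the 64 GB RAM data nodes described later in this thread.
    for tier in RATIOS:
        print(f"{tier}: {disk_capacity_gb(64, tier)} GB disk per 64 GB RAM node")
```

Under that assumption a 64 GB node maps to 1,920 GB of disk at 30:1 and 10,240 GB at 160:1. Treat these as starting points rather than limits.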

So tell us, what are you trying to accomplish?

What kind of Daily and Long Term Data Volumes?

What is your tolerance of Risk of Downtime or Data Loss?

Which version of Elasticsearch are you using?

For the moment it's a free license, but I can't afford to lose data, so I guess I should have warm instead of cold if I can't get a paid license? (That's not up to me.)

The cluster ingests over 100GB data/day, retention of 1 year.
There has to be no data loss.

Related questions, if you don't mind:

Right now we have:

  • 3 master nodes, each 2 CPU, 2GB RAM
  • 12 data nodes (2 hot, 5 warm, 5 cold) with the hot ones 8 CPU, 64 GB and the other ones 2 CPU 64 GB
  • 3 ingest 2 CPU 4 GB
  • 3 coordinating 2 CPU 4 GB
  1. Do you see any glaring issue with that?
  2. Would it work to have the hot and warm data nodes as dedicated data nodes, and have the cold data nodes also handle ingestion & coordination? Is that possible? Is that a good idea?

Thank you in advance.

What version of Elasticsearch are you using? We need that before we answer.

Also do you have a snapshot repository? Are you taking/ storing snapshots?

For the moment it's the free one, but a license is being considered.

I don't know if we're storing snapshots, I'm new in the project. I can ask around, see if I can find this information.

Snapshot/Restore are supported with Free / Basic version.

We are asking for the version number because there have been some changes to the features over the years.

I will answer this quickly: that is not generally considered best practice. Not to say you can't, but that would be an unusual configuration.

Does that 100GB per day represent both the Primary and Replica, or just the Primary?
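
The primary vs. replica distinction matters because it roughly doubles the retained volume. A rough sketch, using the 100 GB/day and 1 year retention mentioned earlier in the thread: it assumes the on-disk index size equals the ingested volume (in practice it usually differs), that everything sits on 64 GB RAM nodes at 160:1, and that disks are filled to at most 85%, in line with the default low disk watermark. The function names are hypothetical.

```python
import math


def retained_gb(daily_primary_gb, retention_days, replicas):
    """Total GB held at steady state: primaries plus all replica copies."""
    return daily_primary_gb * retention_days * (1 + replicas)


def nodes_needed(total_gb, ram_gb, ratio, max_disk_fill=0.85):
    """Nodes required if each node has ram_gb * ratio GB of disk.

    max_disk_fill is an assumption: headroom kept free on each disk.
    """
    usable_gb = ram_gb * ratio * max_disk_fill
    return math.ceil(total_gb / usable_gb)


if __name__ == "__main__":
    for replicas in (0, 1):
        total = retained_gb(100, 365, replicas)
        count = nodes_needed(total, ram_gb=64, ratio=160)
        print(f"replicas={replicas}: {total} GB retained, {count} nodes at 160:1")
```

With these assumptions, primaries alone come to 36,500 GB and one replica brings that to 73,000 GB, so the answer to this question changes the node count noticeably.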

Can you provide a little more context of what is your use case and what you want to achieve?

For example, what would be the difference between a warm and cold node for your use case?

What are the hardware differences between the warm and cold nodes? Also, what are the hardware specs for each kind of node in terms of disk type? You didn't share the disk type and.

Normally a hot-warm architecture is enough, and when you have an Enterprise license you may use the frozen tier, which uses searchable snapshots.

100 GB per day is the amount of log data.

If I'm not mistaken, the version is 8.11.
