What are the requirements?

Christian_Dahlqvist · October 5, 2024, 5:58pm

No, not within a single node. That is where the tiered architecture I described comes in. Did you read the blog post I linked to? That one is a bit old but you can easily find others through Google if you want a different take on this common architecture.

That is the purpose of the architecture, but the node counts you specified sound very low given the estimated data volumes. You can technically store a lot of data on a warm node with HDD, but be aware that the more data you store, the slower queries will likely be.

dsagent · October 5, 2024, 6:11pm

Yes, I read it.
But what I meant is that even if I set up the tiers you mentioned, I cannot move data from the SSD to the HDD after 14 days, because I did not find any setting where I could enter the path of the other disk.

dsagent · October 5, 2024, 6:16pm

What is the number of nodes suitable for this operation?
And what is the best scenario in your opinion?
Because if mistakes or something like that happen, it will be bad.

Christian_Dahlqvist · October 5, 2024, 6:20pm

Nodes belonging to different tiers will be labelled and ILM (see other link I provided) is the process that will move indices between tiers based on configured policies.
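
As a rough sketch of what this looks like in practice: nodes are assigned to a tier through `node.roles` in `elasticsearch.yml` (for example `data_hot` on the SSD nodes and `data_warm` on the HDD nodes), so there is no disk path to enter in the policy itself. The Python snippet below only builds the JSON body that could be sent with `PUT _ilm/policy/<name>`. The rollover thresholds, the priority value, and the 365-day retention are illustrative assumptions; only the 14-day move to warm comes from the question above.

```python
import json


def build_ilm_policy(warm_after_days=14, delete_after_days=365):
    """Build the request body for PUT _ilm/policy/<name> (illustrative values)."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Assumed rollover thresholds; tune them for your data.
                        "rollover": {
                            "max_age": "1d",
                            "max_primary_shard_size": "50gb",
                        }
                    }
                },
                "warm": {
                    # Enter the warm phase this long after rollover.
                    "min_age": f"{warm_after_days}d",
                    "actions": {"set_priority": {"priority": 50}},
                },
                "delete": {
                    "min_age": f"{delete_after_days}d",
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_ilm_policy(), indent=2))
```

In recent Elasticsearch versions that use data tier node roles, an index entering the warm phase should be relocated to the `data_warm` nodes without an explicit allocation rule, which is why the warm phase here carries no path or node name.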

If the data volume is accurate and you want to keep it for a full year, I would recommend reaching out to Elastic for help with sizing, as searchable snapshots will save you a lot of money on hardware and they can better help you size this use case.

dsagent · October 5, 2024, 6:24pm

Ok, I got it.

dsagent · October 5, 2024, 6:27pm

Ok
Thank you very much
I will do that, but I don't think it's possible because the data is local.

dsagent · October 5, 2024, 6:29pm

I have another question
What about backups? If I take a snapshot, can I restore it in another set?

Christian_Dahlqvist · October 5, 2024, 6:37pm

You should be able to use searchable snapshots on premises as well, as it is possible to store snapshots on a shared filesystem.

If you can not fit the full data set in the cluster but still need access to parts of it you can store old data in a snapshot repository and restore old indices when needed. This will naturally take time and requires you to have enough spare capacity for the restore.

dsagent · October 5, 2024, 6:41pm

Yes, that's for sure. What I mean is that I took a snapshot, for example, but then there was a problem, and I want to restore this snapshot in a group other than the group that I took it from.

dsagent · October 5, 2024, 6:42pm

Yes, that's what I mean.

Christian_Dahlqvist · October 5, 2024, 7:05pm

I am not sure I understand. Elasticsearch snapshots are taken from the cluster as a whole and you can change settings when you restore.
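
To make this concrete, here is a hedged sketch of restoring a snapshot into a different cluster. It assumes both clusters can reach the same shared filesystem repository (with that path listed under `path.repo` on every node) and that the second cluster runs a compatible Elasticsearch version. The repository location, index pattern, and setting override are hypothetical. The snippet only builds the request bodies for `PUT _snapshot/<repository>` and `POST _snapshot/<repository>/<snapshot>/_restore`; it does not contact a cluster.

```python
import json


def repository_body(location, readonly):
    """Body for PUT _snapshot/<repository> backed by a shared filesystem."""
    return {
        "type": "fs",
        "settings": {"location": location, "readonly": readonly},
    }


def restore_body(indices, replicas=0):
    """Body for POST _snapshot/<repository>/<snapshot>/_restore."""
    return {
        "indices": indices,
        # Skip cluster-wide state so only the selected indices are restored.
        "include_global_state": False,
        # Settings can be overridden at restore time.
        "index_settings": {"index.number_of_replicas": replicas},
    }


if __name__ == "__main__":
    # Hypothetical path and index pattern, for illustration only.
    print(json.dumps(repository_body("/mnt/es_backups", readonly=True), indent=2))
    print(json.dumps(restore_body("logs-2024.09.*"), indent=2))
```

Registering the repository as read-only on the second cluster is a cautious choice, since it keeps two clusters from writing to the same repository.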

dsagent · October 5, 2024, 7:30pm

On what basis is the number of nodes determined?

dsagent · October 5, 2024, 7:35pm

Of course I understand you.

Let me explain further. Suppose I had a group with data in it and I took a snapshot, and then there was a problem in that group. Could I take the snapshot to another group and restore it there? Or suppose I wanted to do some analysis and restore the snapshot in a different group than the main one. Can it be restored in the new group?

Christian_Dahlqvist · October 5, 2024, 7:52pm

I do not understand what you mean by group.

dsagent · October 5, 2024, 8:05pm

Cluster

dsagent · October 5, 2024, 8:07pm

Can you explain to me?

Christian_Dahlqvist · October 5, 2024, 8:51pm

All these tiers would be subsets of nodes within a single cluster. The snapshots are taken cluster-wide.

For logging and metrics use cases it is often driven by storage. The expected number of concurrent queries and the acceptable query latencies, together with the type of storage used, tend to drive how much data each node can hold. This will vary by tier and you most likely need to test with your data and hardware in order to find out what works for your particular use case.
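
As a back-of-the-envelope illustration of storage-driven sizing, the sketch below estimates a node count for one tier. Every input is a hypothetical placeholder (daily volume, days spent in the tier, replica count, usable disk per node), and the formula ignores query load entirely, so treat the result as a starting point for testing rather than an answer.

```python
import math


def nodes_for_tier(daily_gb, days_in_tier, replicas, usable_gb_per_node):
    """Estimate how many nodes a tier needs from storage alone.

    Total footprint = daily volume * days kept in the tier * copies of the data,
    where copies = 1 primary + the number of replicas.
    """
    total_gb = daily_gb * days_in_tier * (1 + replicas)
    return max(1, math.ceil(total_gb / usable_gb_per_node))


if __name__ == "__main__":
    # Hypothetical hot tier: 100 GB/day for 14 days, 1 replica, 1000 GB usable per node.
    print(nodes_for_tier(100, 14, 1, 1000))  # 2800 GB / 1000 GB per node -> 3
```

The usable capacity per node is the number that needs testing: it should be set to the amount of data a node can hold while still meeting the query latency targets, which is typically lower than the raw disk size.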

dsagent · October 7, 2024, 5:52pm

What is the maximum amount of data that one node can hold?

Christian_Dahlqvist · October 7, 2024, 6:38pm

It will depend on the specification of the node, the type of storage used and whether it performs indexing or not. The more data you put on a node, the slower queries will generally be. This is why you need to test it with your data, hardware and requirements around query latencies.

dsagent · October 7, 2024, 6:49pm

ok
thank you
