No, not within a single node. That is where the tiered architecture I described comes in. Did you read the blog post I linked to? That one is a bit old but you can easily find others through Google if you want a different take on this common architecture.
That is the purpose of the architecture, but the node counts you specified sound very low given the estimated data volumes. You can technically store a lot of data on a warm node with HDD, but be aware that the more data you store, the slower queries will likely be.
Yes, I read it.
But what I meant is that even if I create the tiers mentioned, I cannot move data from the SSD disk to the HDD after 14 days, because I did not find any setting where I can enter the path of the other disk.
What number of nodes is suitable for this setup?
And what is the best scenario in your opinion?
I ask because it will be bad if something goes wrong, such as a mistake in the setup.
Nodes belonging to different tiers will be labelled and ILM (see other link I provided) is the process that will move indices between tiers based on configured policies.
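To make that concrete, here is a rough sketch of what such a policy could look like, written as a small Python helper that only builds the JSON body you would send to `PUT _ilm/policy/<name>`. It assumes a recent Elasticsearch version where each node declares its tier through `node.roles` (for example `data_hot` or `data_warm`) in `elasticsearch.yml` and points `path.data` at its own disk, so the policy itself never contains a disk path. The 14 days and the one year come from this thread; the rollover thresholds and the function name are my own placeholders, not a recommendation.

```python
import json


def build_ilm_policy(warm_after_days: int = 14, delete_after_days: int = 365) -> dict:
    """Build the body for PUT _ilm/policy/<name> (illustrative values only)."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "min_age": "0ms",
                    "actions": {
                        # Assumed rollover thresholds; tune them for your data.
                        "rollover": {
                            "max_age": "1d",
                            "max_primary_shard_size": "50gb",
                        }
                    },
                },
                "warm": {
                    # With rollover in use, min_age counts from the rollover time.
                    "min_age": f"{warm_after_days}d",
                    # migrate relocates shards to nodes with the data_warm role.
                    "actions": {"migrate": {}},
                },
                "delete": {
                    "min_age": f"{delete_after_days}d",
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_ilm_policy(), indent=2))
```

The point is that the move from SSD to HDD happens because the index is relocated to different nodes, not because a path changes on one node.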
If the data volume is accurate and you want to keep it for a full year, I would recommend reaching out to Elastic for help with sizing, as searchable snapshots will save you a lot of money on hardware and they can better help you size this use case.
You should be able to use searchable snapshots on premises as well, as it is possible to store snapshots on a shared filesystem.
If you cannot fit the full data set in the cluster but still need access to parts of it, you can store old data in a snapshot repository and restore old indices when needed. This will naturally take time and requires you to have enough spare capacity for the restore.
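As a rough illustration of the restore-when-needed approach, the sketch below builds the two requests involved: registering a shared filesystem repository and restoring selected indices from a snapshot. It only constructs the method, path and JSON body; sending them is left to whatever client you use. The repository name, mount path, snapshot name and index pattern are all made up for the example, and the location has to be listed under `path.repo` on every node for an `fs` repository to work.

```python
import json


def register_fs_repository(repo: str, location: str):
    """Request that registers a shared filesystem snapshot repository."""
    body = {"type": "fs", "settings": {"location": location}}
    return "PUT", f"/_snapshot/{repo}", body


def restore_indices(repo: str, snapshot: str, indices: list):
    """Request that restores only the named indices from one snapshot."""
    body = {
        "indices": ",".join(indices),
        # Leave the current cluster settings and templates untouched.
        "include_global_state": False,
    }
    return "POST", f"/_snapshot/{repo}/{snapshot}/_restore", body


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    requests = [
        register_fs_repository("archive_repo", "/mnt/es_snapshots"),
        restore_indices("archive_repo", "snap-2024.01", ["logs-2024.01.*"]),
    ]
    for method, path, body in requests:
        print(method, path)
        print(json.dumps(body, indent=2))
```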
Yes, that's for sure. What I mean is this: say I took a snapshot, then there was a problem, and I want to restore that snapshot in a group other than the one I took it from.
To explain further: suppose I have a group with data in it and I take a snapshot. Then there is a problem in the current group, and I want to restore the snapshot in a new group instead. Or suppose I want to do some analysis and would like to restore the snapshot in a different group than the main one. Can it be restored in the new group?
All these tiers would be subsets of nodes within a single cluster. The snapshots are taken cluster-wide.
For logging and metrics use cases it is often driven by storage. The expected number of concurrent queries and the acceptable query latencies, together with the type of storage used, tend to drive how much data each node can hold. This will vary by tier, and you will most likely need to test with your data and hardware in order to find out what works for your particular use case.
It will depend on the specification of the node, the type of storage used and whether it performs indexing or not. The more data you put on a node, the slower queries will generally be. This is why you need to test it with your data, hardware and requirements around query latencies.
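Once testing has told you how much data a node in a given tier can hold at acceptable query latency, the storage-driven part of the estimate is simple arithmetic. Here is a minimal sketch of it; every number in the example call is a placeholder and not a figure from this thread, and the function name is mine. The 0.85 fill limit mirrors the default low disk watermark of 85%.

```python
import math


def nodes_for_tier(daily_gb: float, days_in_tier: int, replicas: int,
                   node_capacity_gb: float, max_fill: float = 0.85) -> int:
    """Estimate the node count for one tier from storage needs alone.

    node_capacity_gb should be the amount of data a node can hold while
    still meeting your latency targets, as found by your own testing.
    """
    # Each replica is a full extra copy of the primary data.
    total_gb = daily_gb * days_in_tier * (1 + replicas)
    # Keep headroom so nodes stay below the disk watermark.
    usable_gb = node_capacity_gb * max_fill
    return math.ceil(total_gb / usable_gb)


if __name__ == "__main__":
    # Placeholder inputs: 500 GB/day kept 14 days with 1 replica on 4 TB nodes.
    print(nodes_for_tier(500, 14, 1, 4000))
```

This ignores indexing load, shard counts and failure tolerance, so treat the result as a lower bound to check against real tests.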