I am designing a hot-warm Elasticsearch architecture for a logging solution.
My total storage estimate is 180 TB, based on the following requirements:
7200 EPS ingestion
350 bytes/event (average size)
1 replica for HA
1 year retention
I plan to start with a cluster of 3 warm nodes (60 TB each, including replica shards), with the option to add nodes later as resource utilization requires.
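As a sanity check on the figures above, the raw volume can be worked out as follows. This is only a rough sketch: it assumes decimal terabytes and counts raw event bytes, whereas the actual on-disk index size depends on mappings and compression and has to be measured.

```python
# Back-of-the-envelope storage estimate from the figures in the question.
EPS = 7200               # events per second
BYTES_PER_EVENT = 350    # average event size in bytes
REPLICAS = 1             # one replica copy of every primary shard
RETENTION_DAYS = 365     # one year retention

SECONDS_PER_DAY = 86_400
TB = 10 ** 12            # decimal terabyte (assumption)


def raw_bytes_per_day(eps=EPS, size=BYTES_PER_EVENT):
    """Raw ingested bytes per day, before replication."""
    return eps * size * SECONDS_PER_DAY


def total_bytes(days=RETENTION_DAYS, replicas=REPLICAS):
    """Raw bytes held over the retention period, primaries plus replicas."""
    return raw_bytes_per_day() * days * (1 + replicas)


if __name__ == "__main__":
    print(f"raw per day: {raw_bytes_per_day() / 10**9:.1f} GB")
    print(f"one year incl. replica: {total_bytes() / TB:.1f} TB")
```

On these assumptions the raw data comes to roughly 159 TB including the replica, so the 180 TB figure leaves about 13% headroom.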
My two questions are:
1- Can I start the warm nodes with less disk space (e.g. 20 TB) and later increase the disk space on the same warm nodes to reach 60 TB?
(Note that I have a replica configured on the warm nodes as well.)
2- Once the data on the warm nodes no longer needs to be kept there (to free storage for newer logs), is it possible to move it to cold/external storage? If so, what is the process for accessing a log/event in this archived data? (Are there specific steps to import, activate, search and deactivate old events?)
As explained in this blog post, each shard comes with some amount of overhead in terms of heap usage. Since heap is finite, there is a limit to how much data a node can hold, and this depends on the type of data as well as the mappings. How much data you can store on a node therefore often becomes an exercise in optimising heap usage. You will need to benchmark to see how much data each of your nodes can hold, but in my experience 60 TB per node sounds like far too much. I would expect you to need a significantly larger number of warm nodes to handle that data volume.
First check how much a node can handle.
You can use the snapshot API to archive old indices offline.
Also be aware that with very dense nodes for long-term storage, a lot of data has to be redistributed when a node fails, which can easily cause problems.
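For the snapshot-based archiving mentioned above, the usual cycle is: register a repository, snapshot the old index, delete it from the cluster, and restore it later when it needs to be searched again. The sketch below only builds the REST requests and does not send them. The repository name, index name and filesystem path are hypothetical, and a shared-filesystem (`fs`) repository is assumed.

```python
import json

REPO = "log_archive"  # hypothetical repository name


def register_repo(location):
    # The fs location must be listed under path.repo in elasticsearch.yml.
    body = {"type": "fs", "settings": {"location": location}}
    return ("PUT", f"/_snapshot/{REPO}", body)


def snapshot_index(snapshot, index):
    # Archive one index into the repository.
    body = {"indices": index, "include_global_state": False}
    return ("PUT", f"/_snapshot/{REPO}/{snapshot}?wait_for_completion=true", body)


def delete_index(index):
    # Free disk space on the warm nodes once the snapshot has succeeded.
    return ("DELETE", f"/{index}", None)


def restore_index(snapshot, index):
    # Bring the archived index back so it can be searched again.
    body = {"indices": index, "include_global_state": False}
    return ("POST", f"/_snapshot/{REPO}/{snapshot}/_restore", body)


if __name__ == "__main__":
    steps = [
        register_repo("/mnt/es_archive"),              # hypothetical path
        snapshot_index("logs-2019.01", "logs-2019.01"),
        delete_index("logs-2019.01"),
        restore_index("logs-2019.01", "logs-2019.01"),
    ]
    for method, path, body in steps:
        print(method, path, json.dumps(body) if body else "")
```

After a restore the index is searchable like any other. Deleting it again once the investigation is done returns the disk space, and the snapshot itself stays in the repository.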
Regarding point 1: even if I shrink the indices and force merge them down to a single segment (as per the hot-warm best practice), will it still be the same scenario?
Note that the main requirement of this design is to be able to search old logs (and export the results) by time range and source/destination IP.
Yes, there is always overhead that you need to consider, although it varies depending on mappings, shard sizes etc. The only thing that removes it is closing the indices, but that also means they are no longer searchable, and Elasticsearch will not make sure they are replicated in case of node failures.
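For reference, closing and reopening an index goes through the open/close index API. The helper below is a minimal sketch that only builds the request line, and the index name is hypothetical.

```python
def index_state_request(index, state):
    """Return (method, path) for closing or reopening an index."""
    if state not in ("close", "open"):
        raise ValueError("state must be 'close' or 'open'")
    return ("POST", f"/{index}/_{state}")


if __name__ == "__main__":
    # Close an old index to drop its heap overhead, reopen it before searching.
    print(*index_state_request("logs-2019.01", "close"))
    print(*index_state_request("logs-2019.01", "open"))
```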