I want to use Elasticsearch for analysis of static data. I tested with a subset and it worked as planned: I could query the data, build graphs in Kibana, etc. It is quite useful. That said, since the data is static, should I re-index all of the raw data every time I need to boot up my cluster, or can I restore a dump? Search-for-analysis is the most important part, so that I can query the data effectively, but if it turns out to be more practical to ingest the raw data again each time, I would need to speed that process up: I need the full dataset available, which means parsing and consuming 5 TB of metrics, and that takes time (a re-ingest sketch is below).
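For reference, this is roughly how I would try to speed up re-ingestion if that is the better option: a minimal sketch using the elasticsearch-py bulk helpers. It assumes a local single-node cluster, a target index called `metrics`, and raw data stored as newline-delimited JSON; all of those names are placeholders for my real setup.

```python
import json
from elasticsearch import Elasticsearch, helpers

# Assumptions: local single-node cluster, target index "metrics",
# raw metrics available as newline-delimited JSON files.
es = Elasticsearch("http://localhost:9200")

def actions(path):
    """Yield one bulk indexing action per raw metric record."""
    with open(path) as f:
        for line in f:
            yield {"_index": "metrics", "_source": json.loads(line)}

# parallel_bulk batches documents and indexes them from several threads,
# which is much faster than sending one document per request.
for ok, item in helpers.parallel_bulk(es, actions("metrics.ndjson"),
                                      thread_count=4, chunk_size=2000):
    if not ok:
        print("failed:", item)
```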
If I build the initial Elasticsearch database the way I like it, containing all of the data, would it be better to dump the entire thing to disk as a data dump of sorts, or to reprocess the raw data each time I need to turn on the cluster?
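To make the question concrete, the kind of "data dump" I have in mind is the snapshot/restore mechanism. Below is a minimal sketch over plain HTTP, assuming a local cluster, a shared-filesystem repository at `/mnt/archive/es_snapshots` (which has to be listed under `path.repo` in `elasticsearch.yml`), and placeholder names `cold_backup` / `full_dataset`.

```python
import requests

ES = "http://localhost:9200"  # placeholder cluster address

# Register a shared-filesystem snapshot repository
# ("location" must appear under path.repo in elasticsearch.yml).
requests.put(f"{ES}/_snapshot/cold_backup", json={
    "type": "fs",
    "settings": {"location": "/mnt/archive/es_snapshots"},
})

# Snapshot every index; wait_for_completion blocks until it finishes.
requests.put(f"{ES}/_snapshot/cold_backup/full_dataset",
             params={"wait_for_completion": "true"})

# Later, on a freshly started cluster with the same repository registered,
# restore the snapshot instead of re-ingesting the raw data
# (the target indices must not already exist, or must be closed).
requests.post(f"{ES}/_snapshot/cold_backup/full_dataset/_restore")
```

My understanding is that a restore copies the already-built index files back into the cluster, so it should be mostly I/O-bound rather than repeating the parse-and-index work.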
Ideally I want to turn the cluster on and off ad hoc, shutting it down when I don't need it for this use case in order to preserve it. I can stop and restart the cluster and it keeps all of its data, but if I need to disassemble anything or move hardware, I am not sure whether it would be better to store the Elasticsearch data files on a single HDD for cold-storage purposes, or just the raw data.