We receive approximately 1 TB of data every day (including replicas) across 100+ indexes. We currently have 16 TB of storage, which holds data for 15 days. Our ES cluster has 2 dedicated MI nodes, 6 DI nodes, and 1 MDI node (M = master, D = data, I = ingest).
We would like to optimize the cluster for better storage (30 days of retention) and better index/search performance. Specifically:
- Should we have dedicated master nodes (with no ingest role)?
- Should we add coordinating-only (client) nodes for query purposes?
- Can we use `best_compression` for storage? How would it affect our search queries?
- We create a snapshot of the current index state every day. When we restore an index for a given day (around 150 GB), it takes approximately 2 hours. Is there a faster way to restore indexes from a snapshot?
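For context on the first two questions, this is roughly how we understand the node configuration would look (a sketch using the pre-7.x boolean role flags; the exact settings depend on the ES version, and newer releases use `node.roles` instead):

```
# elasticsearch.yml -- dedicated master-eligible node (no data, no ingest)
node.master: true
node.data: false
node.ingest: false

# elasticsearch.yml -- coordinating-only "client" node
# (all three roles disabled; it only routes and merges search requests)
# node.master: false
# node.data: false
# node.ingest: false
```
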
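On the compression question: as far as we understand, `index.codec` is a static index setting, so it can only be applied at index creation (e.g. via a template) or on a closed index. A sketch of what we assume the change would look like on an existing index (`my-index` is a placeholder):

```
POST /my-index/_close

PUT /my-index/_settings
{
  "index.codec": "best_compression"
}

POST /my-index/_open

# force-merging afterwards rewrites existing segments with the new codec
POST /my-index/_forcemerge?max_num_segments=1
```
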
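On the restore question, two knobs we have been looking at (hedged, since defaults vary by version): the repository-level restore throttle and the per-node recovery throttle. We also understand a restore can be requested with zero replicas and replicas added back afterwards, which roughly halves the data copied during the restore itself. A sketch, with `my_backup`, the location, and the rates as placeholders for our setup:

```
# raise the repository restore throttle (default has been 40mb in some versions)
PUT /_snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "location": "/mount/backups",
    "max_restore_bytes_per_sec": "200mb"
  }
}

# raise the per-node recovery throttle
PUT /_cluster/settings
{
  "transient": {
    "indices.recovery.max_bytes_per_sec": "200mb"
  }
}

# restore without replicas, then bump index.number_of_replicas afterwards
POST /_snapshot/my_backup/daily-snap/_restore
{
  "indices": "my-index",
  "index_settings": {
    "index.number_of_replicas": 0
  }
}
```
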