Elasticsearch: What's the most cost-effective way to store big data?

We are using Basic Elasticsearch v7.4 on a single node with nearly 2 TB of data. We are planning to increase our retention, but we are constrained by the node's storage capacity. Adding disks and using multiple data paths is an option, but not a recommended one; the alternative is LVM, which I found quite troublesome.

We are considering data tiers and adding new nodes in different tiers such as 'cold' or 'frozen'. Is this the best method to achieve this? I have tested the cold tier, but does this version of ES support a frozen tier? We do not expect the same search performance for very old data as for recently created indices. Are HDDs the best option for nodes in these tiers to save cost? Is heap memory still a performance factor in these tiers?

What are some alternatives to storing big data in ES, if there are any?


Data tiering also depends on how you query: do you have a notion of time or age in your data?

You could think about frozen and/or cold indices if you don't need to access the data on a regular basis.
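If you go the frozen-index route, here is a minimal sketch using the 7.x freeze API (the index name `logs-2019.10` is just a placeholder for one of your old indices). Frozen indices release their heap overhead when idle, which speaks to your heap question, at the cost of slower searches:

```
# Freeze an old, rarely-searched index so it stops holding heap when idle
POST /logs-2019.10/_freeze

# Frozen indices are excluded from searches by default;
# opt in explicitly when you do need to query them
GET /logs-2019.10/_search?ignore_throttled=false
{
  "query": { "match_all": {} }
}
```

Note this is the per-index freeze API, not the frozen *tier* backed by searchable snapshots, which is a later feature than 7.4.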

Data tiering is an effective way to increase Elasticsearch storage capacity when, for example, you regularly query only the last 7 days and touch the frozen data once a year.
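On 7.4 you would typically implement this with ILM plus shard allocation filtering rather than the built-in tier roles. A sketch, assuming you tag your cheap-disk nodes with a custom attribute such as `node.attr.data: cold` in their `elasticsearch.yml` (the attribute name, policy name, and thresholds here are examples, not fixed values):

```
PUT _ilm/policy/logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_size": "50gb", "max_age": "7d" }
        }
      },
      "cold": {
        "min_age": "7d",
        "actions": {
          "allocate": { "require": { "data": "cold" } },
          "freeze": {}
        }
      }
    }
  }
}
```

Indices managed by this policy roll over while hot, then after 7 days are relocated onto the nodes tagged `data: cold` (your HDD nodes, if you go that route) and frozen to reduce their heap footprint.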

Why would LVM save you any space or increase storage effectiveness?

Also, if you're using enterprise-grade storage, look into deduplication on your storage hardware; it can save a surprising amount of space, as high as 60%+ on some datasets.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.