How much external storage can be added to an Elasticsearch node


(Tamizharasan) #1

I am trying to set up a cluster in which every node has 4 GB RAM, 2 CPUs, and a 60 GB internal hard disk. With this configuration, how much external storage (SSD) can I add per node for Elasticsearch data without compromising indexing and search performance?

Can anyone suggest a maximum storage configuration? I ask because external storage is cheaper than moving to nodes with more built-in capacity.


(Mark Walkom) #2

You need to test that against your data and query types.

But if you look at Elastic Cloud as an example, it uses a 1:24 ratio, so that's somewhere to start.


(Tamizharasan) #3

I have already attached a 300 GB external SSD and am testing it, but I am still not sure how much external storage is reasonable. Can you elaborate on the 1:24 ratio? I haven't heard of it before.
Thanks for your fast response :slight_smile:


(Mark Walkom) #4

In Elastic Cloud we have 1GB of memory to 24GB of disk.
So that is a place to start in terms of ratios.
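
As a rough illustration of that arithmetic, here is a minimal Python sketch. The function name `starting_disk_gb` is made up for this example, and the 1:24 figure is only a starting point to benchmark against, not a limit. For the 4 GB nodes described above it gives 96 GB, so the 300 GB SSD already attached works out to roughly 1:75.

```python
# Starting-point ratio only: 1 GB of RAM to 24 GB of disk, as used in Elastic Cloud.
DISK_GB_PER_GB_RAM = 24


def starting_disk_gb(ram_gb, ratio=DISK_GB_PER_GB_RAM):
    """Return a starting disk size in GB for a node with `ram_gb` of RAM.

    `starting_disk_gb` is a hypothetical helper for illustration; the result
    is a first guess to test against real data and queries.
    """
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    return ram_gb * ratio


if __name__ == "__main__":
    # 4 GB RAM nodes from the original question
    print(starting_disk_gb(4))  # 96
```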


(Christian Dahlqvist) #5

The RAM-to-disk ratio will depend a lot on your use case, query patterns and latency requirements. 1:24 is not in any way a hard limit for Elasticsearch, but rather what we use in Elastic Cloud, as it is suitable for a wide variety of use cases. I do agree with Mark that it is a good starting point though.


(Tamizharasan) #6

@Christian_Dahlqvist Thanks a lot for your viewpoint. My use case is collecting data every day and indexing it in Elasticsearch (one index per month).
At the same time I need the data for full-text search. Searches cover the last year of data, so every query will hit only 12 indices. I am building an enterprise application, so the query volume will be low, and older data (more than a year old) will be queried very rarely.
Can you suggest a good ratio for this case? It would help me a lot. I will test the ratio before going live anyway, but I need a suggestion to start building the test cluster. If you know of any official reference link, that would be helpful too. Thanks in advance :slight_smile:


(Christian Dahlqvist) #7

The ideal ratio will depend on your hardware and your index and query patterns, so the only way to really know is to benchmark with realistic data and queries on the actual hardware. We talked about cluster sizing at Elastic{ON}, and this might give you an idea about how to go about determining the ratio for your use case and hardware.


(Tamizharasan) #8

For now we only need full-text search in our application. Five fields from our JSON are more than enough for that, but we don't want to lose the rest of the data we receive every day.

I read that Hadoop can be used for data storage and that it is cheaper than Elasticsearch for that purpose. So the idea is to store the full data in Hadoop and only the required fields in Elasticsearch. When we need analytics in our app, we can use ES-Hadoop to handle that scenario.
This decision is not yet final.

We are a small startup and can't spend much money on expensive data storage. Our main goals are to reduce cost, keep only the minimum data needed for search, and still store the older data for future development. Any better suggestions would help me a lot. Thanks in advance.


#9

What is the size of a single record (including all the fields)? How much data do you expect per day? With these numbers you can calculate what you need for a month and then for a year. Do your sizing based on your data.
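
A minimal sketch of that calculation in Python. The record size and daily volume below are placeholder numbers, not figures from this thread, and `estimate_raw_storage_gb` is a made-up helper name. It estimates raw data volume only; the size on disk in Elasticsearch will differ depending on mappings and replicas, so treat the result as an input to benchmarking.

```python
BYTES_PER_GB = 1000 ** 3


def estimate_raw_storage_gb(record_bytes, records_per_day,
                            days_per_month=30, months=12):
    """Estimate raw data volume per day, month, and retention period in GB.

    Hypothetical helper for illustration: it multiplies record size by
    volume and does not model index overhead or replicas.
    """
    daily = record_bytes * records_per_day
    monthly = daily * days_per_month
    total = monthly * months
    return {
        "daily_gb": daily / BYTES_PER_GB,
        "monthly_gb": monthly / BYTES_PER_GB,
        "total_gb": total / BYTES_PER_GB,
    }


if __name__ == "__main__":
    # Placeholder inputs: 2,000-byte records, 1,000,000 records per day
    print(estimate_raw_storage_gb(2000, 1_000_000))
    # {'daily_gb': 2.0, 'monthly_gb': 60.0, 'total_gb': 720.0}
```

Dividing `total_gb` by the disk you plan to attach per node gives a first guess at the number of data nodes to benchmark.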

Keep only the required data in Elasticsearch. Store the complete data set externally, in a separate NoSQL store or on a file system.
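
One way to picture that split, as a sketch only: the field names in `SEARCH_FIELDS` are invented for illustration (the thread only says that five fields are needed), and the code stops short of calling Elasticsearch or any storage system. It just separates the small search document from the full record you would archive.

```python
import json

# Hypothetical field names; replace with the five fields you actually search on.
SEARCH_FIELDS = ("title", "body", "author", "tags", "created_at")


def split_record(record, search_fields=SEARCH_FIELDS):
    """Return (search_doc, archive_line) for one incoming JSON record.

    search_doc holds only the fields needed for full-text search and is
    what you would index; archive_line is the complete record serialized
    as one JSON line for cheaper external storage.
    """
    search_doc = {key: record[key] for key in search_fields if key in record}
    archive_line = json.dumps(record, sort_keys=True)
    return search_doc, archive_line


if __name__ == "__main__":
    incoming = {"title": "Example", "body": "Some text", "raw_payload": "x" * 20}
    doc, line = split_record(incoming)
    print(doc)   # only the search fields present in the record
    print(line)  # the full record as a single JSON line
```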

