How can we store large-scale data on a machine with 32GB RAM / 30TB disk?


#1

My problem is this:
I have some machines, each with about 32 GB of RAM and 30 TB of disk storage.
Once disk usage reaches 10%, memory usage may reach up to 90%.
So although I have 30 TB, I cannot use that much disk storage, because memory is the limit.

Most of the RAM is consumed by the FST term index. My ES is 2.3.x.
Please give me some advice!
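
A minimal sketch of how I check that the terms (FST) index is what is eating the heap, using the node stats API (assuming a node reachable at localhost:9200 and the `terms_memory_in_bytes` / `memory_in_bytes` field names from the 2.x segment stats; adjust host and fields as needed):

```python
# Sketch: print per-node terms (FST) memory vs. total segment memory from node stats.
# Assumes a node at localhost:9200 and the 2.x segment-stats field names.
import json
from urllib.request import urlopen

STATS_URL = "http://localhost:9200/_nodes/stats/indices/segments"

with urlopen(STATS_URL) as resp:
    stats = json.loads(resp.read().decode("utf-8"))

for node_id, node in stats["nodes"].items():
    seg = node["indices"]["segments"]
    terms_mb = seg.get("terms_memory_in_bytes", 0) / 1024.0 / 1024.0
    total_mb = seg.get("memory_in_bytes", 0) / 1024.0 / 1024.0
    print("%s: terms memory %.0f MB (of %.0f MB total segment memory)"
          % (node.get("name", node_id), terms_mb, total_mb))
```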


(Christian Dahlqvist) #2

Each shard comes with overhead in terms of memory and file handles, so in order to maximise the amount of data a node can hold, make sure your shards are in the tens of GB in size. Having said that, I doubt you will be able to utilise anywhere close to that amount of disk space given the limited amount of RAM you have.
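
For reference, a rough sketch to check how each shard's on-disk size compares to its segment memory, via the _cat/shards API (assuming the `store` and `segments.memory` columns and the `bytes` parameter are available on your version; localhost:9200 is a placeholder):

```python
# Sketch: list each shard's store size vs. its segment memory via _cat/shards.
from urllib.request import urlopen

CAT_URL = ("http://localhost:9200/_cat/shards"
           "?h=index,shard,prirep,store,segments.memory&bytes=b")

with urlopen(CAT_URL) as resp:
    lines = resp.read().decode("utf-8").splitlines()

for line in lines:
    parts = line.split()
    if len(parts) < 5:   # skip unassigned shards with missing columns
        continue
    index, shard, prirep, store, seg_mem = parts[:5]
    print("%s/%s (%s): store %.1f GB, segment memory %.0f MB"
          % (index, shard, prirep,
             int(store) / 1024.0 ** 3, int(seg_mem) / 1024.0 ** 2))
```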


#3

Yeah, I have tried a few things to solve my problem, including setting each shard to tens of GB in size. What I found is that, in my case, one shard of 70 GB in size costs about 470 MB of terms memory.
So even if I used all 32 GB of RAM, that means I could hold about 32 GB / 470 MB ≈ 68 shards, and 68 shards can only store 70 GB × 68 ≈ 4.7 TB.
:cry:
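
The same back-of-the-envelope calculation written out, assuming the ~470 MB of terms memory per 70 GB shard measured above holds across the whole dataset:

```python
# Back-of-the-envelope capacity estimate from the numbers measured above.
heap_gb = 32           # optimistic: assume the whole 32 GB could go to the heap
shard_size_gb = 70     # observed shard size on disk
terms_mem_mb = 470     # observed terms (FST) memory per shard

max_shards = int(heap_gb * 1024 / terms_mem_mb)    # ~69 shards, close to the ~68 above
max_data_tb = max_shards * shard_size_gb / 1024.0  # ~4.7 TB

print("max shards: %d, max data: %.1f TB" % (max_shards, max_data_tb))
```

In practice the heap would be capped well below the full 32 GB (the usual guidance is at most half the machine's RAM), so the real ceiling is even lower.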


#4

You can take a look at https://github.com/elastic/elasticsearch/issues/24269 and vote for it. But IMHO this PR will never be accepted by the Elastic team, neither in 2.x nor in the master branch (I'd be happy to be wrong).


#5

Good. Maybe we can contribute some change to the Lucene core to unload indices?


(system) #6

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.