How can we store large-scale data on a machine with 32GB RAM / 30TB disk?

My problem is this:
I have a machine with about 32GB of RAM and 30TB of disk storage.
Once disk usage reaches 10%, memory usage can climb to about 90%.
So although I have 30TB of disk, I cannot use most of it, because memory is the limiting factor.

Most of the RAM is consumed by the FST terms index. My ES version is 2.3.x.
Please give me some advice!
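
A rough way to confirm that the terms index dominates the heap is to read the per-index segment stats. This is just a sketch, assuming the cluster is reachable on localhost:9200 and the Python `requests` library is installed; `terms_memory_in_bytes` may not be reported by every version, hence the defensive `.get`:

```python
# Rough sketch: report segment memory (and the terms/FST portion, if exposed)
# per index. Assumes the cluster runs on localhost:9200.
import requests

ES_URL = "http://localhost:9200"  # adjust to your cluster

stats = requests.get(ES_URL + "/_stats/segments").json()

rows = []
for index, data in stats["indices"].items():
    seg = data["total"]["segments"]
    rows.append((
        index,
        seg.get("memory_in_bytes", 0),        # total segment heap for this index
        seg.get("terms_memory_in_bytes", 0),  # terms/FST portion, if reported
    ))

# biggest heap consumers first
for index, total_mem, terms_mem in sorted(rows, key=lambda r: r[1], reverse=True):
    print("%-40s total=%8.1f MB  terms=%8.1f MB"
          % (index, total_mem / 2.0**20, terms_mem / 2.0**20))
```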

Each shard comes with overhead in terms of memory and file handles, so in order to maximise the amount of data a node can hold, make sure your shards are in the tens of GB in size. Having said that, I doubt you will be able to utilise anywhere close to that amount of disk space given the limited amount of RAM you have.
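
To see how your current shards compare with that guideline, here is a minimal sketch (assuming the cluster is on localhost:9200 and the Python `requests` library) that parses the plain-text `_cat/shards` output:

```python
# Rough sketch: list shard store sizes so they can be checked against the
# "tens of GB per shard" guideline.
import requests

ES_URL = "http://localhost:9200"  # adjust to your cluster

text = requests.get(ES_URL + "/_cat/shards", params={
    "h": "index,shard,prirep,store",  # only the columns we need
    "bytes": "b",                     # report store size in bytes
}).text

for line in text.strip().splitlines():
    fields = line.split()
    if len(fields) < 4:
        continue  # unallocated shard: no store size reported
    index, shard, prirep, store = fields[:4]
    print("%-40s shard %s (%s): %6.1f GB"
          % (index, shard, prirep, int(store) / 2.0**30))
```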

Yeah, I have tried a few things to solve my problem, including setting each shard to tens of GB in size. I found that:
in my case, one shard of about 70GB in size costs about 470MB of terms memory.
So if I used all 32GB of RAM, that means I could hold about 32GB / 470MB ≈ 68 shards, and 68
shards can only store 70GB * 68 ≈ 4.7TB.
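
Putting that back-of-envelope math into a tiny script (the 470MB-per-70GB-shard ratio is just what I measured on my data, so it will vary with mappings and term cardinality):

```python
# Back-of-envelope capacity estimate from the measured ratio above.
HEAP_BUDGET_GB = 32.0             # assume all RAM is available for segment memory
SHARD_SIZE_GB = 70.0              # on-disk size of one shard
TERMS_MEMORY_PER_SHARD_MB = 470.0  # measured terms memory per 70GB shard

max_shards = int(HEAP_BUDGET_GB * 1024 / TERMS_MEMORY_PER_SHARD_MB)
max_data_tb = max_shards * SHARD_SIZE_GB / 1024

print("max shards: %d" % max_shards)        # roughly the 68 above
print("max data:   %.1f TB" % max_data_tb)  # ~4.7 TB out of the 30 TB disk
```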
:cry:

You can take a look at https://github.com/elastic/elasticsearch/issues/24269 and vote for it. But IMHO it will never be accepted by the Elastic team, neither in 2.x nor in the master branch (I'd be happy to be wrong).

Good, maybe we could contribute a change to the Lucene core to unload indices?
