My problem is this:
I have a machine with about 32 GB of RAM and 30 TB of disk storage.
Once disk usage reaches 10%, memory usage can climb to around 90%.
So although I have 30 TB of disk, I cannot actually use that much of it, because memory is the limiting factor.
Most of the RAM is consumed by the FST terms index. My ES version is 2.3.x.
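A minimal sketch of how this can be verified, assuming the node is reachable at localhost:9200 and using a placeholder index name; it sums per-shard terms memory from the index stats API:

```python
# Sketch: sum per-shard terms memory (the in-heap FST terms index) from the
# index stats API. Host/port and index name are assumptions, not from the post.
import json
import urllib.request

ES = "http://localhost:9200"   # assumed local node
INDEX = "my_index"             # assumed index name, replace with yours

url = "%s/%s/_stats/segments?level=shards" % (ES, INDEX)
with urllib.request.urlopen(url) as resp:
    stats = json.load(resp)

total_terms_bytes = 0
for shard_id, copies in sorted(stats["indices"][INDEX]["shards"].items()):
    for copy in copies:
        seg = copy["segments"]
        total_terms_bytes += seg["terms_memory_in_bytes"]
        print("shard %s: terms_memory = %.0f MB (total segment memory = %.0f MB)"
              % (shard_id, seg["terms_memory_in_bytes"] / 2.0**20,
                 seg["memory_in_bytes"] / 2.0**20))

print("total terms memory: %.2f GB" % (total_terms_bytes / 2.0**30))
```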
Please give me some advice!
Each shard comes with overhead in terms of memory and file handles, so in order to maximise the amount of data a node can hold, make sure your shards are in the tens of GB in size. Having said that, I doubt you will be able to utilise anywhere close to that amount of disk space given the limited amount of RAM you have.
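If it helps, here is a quick sketch (assuming a node at localhost:9200) that lists current shard sizes via the cat shards API, so you can check whether they are in that tens-of-GB range:

```python
# Sketch: print index, shard number, primary/replica flag and store size in GB.
# The endpoint URL is an assumption (local node, default port).
import urllib.request

url = ("http://localhost:9200/_cat/shards"
       "?v&h=index,shard,prirep,store&bytes=gb")
with urllib.request.urlopen(url) as resp:
    print(resp.read().decode("utf-8"))
```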
Yeah, I have already tried a few things to solve this, including keeping each shard in the tens of GB in size. What I found:
In my case, a shard of about 70 GB costs roughly 470 MB of terms memory.
So even if I could spend all 32 GB of RAM on it, that means I can hold about 32 GB / 470 MB ≈ 68 shards, and 68 shards can only store about 70 GB × 68 ≈ 4.7 TB.
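A back-of-envelope version of that calculation, using the numbers above (treating all 32 GB of RAM as available for terms memory is of course optimistic):

```python
# Rough capacity estimate from the numbers above.
terms_memory_per_shard_mb = 470.0   # observed for one ~70 GB shard
shard_size_gb = 70.0
ram_gb = 32.0                       # optimistic: assumes all RAM goes to terms memory

max_shards = int(ram_gb * 1024 // terms_memory_per_shard_mb)
usable_tb = max_shards * shard_size_gb / 1024
print("%d shards -> about %.1f TB of data" % (max_shards, usable_tb))
# prints roughly: 69 shards -> about 4.7 TB of data, far below the 30 TB of disk
```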
You can take a look at https://github.com/elastic/elasticsearch/issues/24269 and vote for it. But IMHO this PR will never be accepted by the Elastic team, neither in 2.x nor in the master branch (I'd be happy to be wrong).