That depends on how large the disk is and what your use case is.
If you are talking about a 4TB drive that Elasticsearch can use, then a 99% watermark leaves roughly 41GB to work with. That's not a lot if Elasticsearch needs to merge a large index.
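For reference, disk-based shard allocation is controlled by the watermark settings below. These are the documented defaults; a sketch of how you might raise them in `elasticsearch.yml` if you want to use more of a large disk (values shown are the stock defaults, not a recommendation):

```yaml
# Default disk watermarks (cluster-level settings).
# low: stop allocating new shards to the node
cluster.routing.allocation.disk.watermark.low: "85%"
# high: start relocating shards away from the node
cluster.routing.allocation.disk.watermark.high: "90%"
# flood_stage: indices with a shard on the node are marked read-only
cluster.routing.allocation.disk.watermark.flood_stage: "95%"
```

Note that on very large disks a percentage watermark can strand a lot of space, which is why absolute values (e.g. `"100gb"`) are also accepted.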
Our typical use case at the customer end is 2TB. I wanted to dig into the cases in which this free space will be needed. One reason I can think of is disk expansion, since Elasticsearch supports only a single data path per node. Could you please elaborate on the index merge use case, with some references?
The underlying Lucene segments of an index are merged over time. So if you have 2 segments that are (eg) 5 gig each, the merge needs roughly another 10 gig of free space to write the merged segment into; only once the merge completes are the original segments deleted.
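The arithmetic above can be sketched as follows. This is a simplified model (it ignores compression gains and deleted-document reclamation during the merge), just to illustrate the transient headroom a merge needs:

```python
def merge_headroom_gb(segment_sizes_gb):
    """Rough upper bound on the extra disk space a merge needs.

    While Lucene merges segments, the new merged segment is written
    alongside the originals, so free space must transiently cover
    roughly the sum of the segments being merged. The originals are
    only deleted after the merge completes.
    """
    return sum(segment_sizes_gb)

# Two 5 GB segments: the merge transiently needs ~10 GB of headroom.
print(merge_headroom_gb([5, 5]))
```

In practice the merged segment is often somewhat smaller than the sum of its inputs (deleted docs are dropped), but planning for the full sum is the safe assumption.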
Thanks for this explanation. We will be running shards of 50 GB each, heavily ingested with user email data. I would like to know: what is the default segment size, how frequently are segments merged, and is there a parameter that caps the maximum merged segment size?