Disk usage threshold

I'm planning to integrate Elasticsearch as a single-node cluster per application server. The application server is hosted in the customer's environment.

For the disk thresholds, I'm planning to configure only the flood_stage watermark, since the low and high watermarks will have no impact on a single-node cluster.

I know the default flood stage is 95%, but I would like to know the consequences of raising it to 99% by default.
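For reference, a minimal sketch of what that change might look like in `elasticsearch.yml` (the setting name is the standard disk allocation decider setting; the 99% value is the one under discussion, not a recommendation):

```yaml
# elasticsearch.yml -- sketch for a single-node cluster.
# The low/high watermarks mainly affect shard relocation, which a
# single node cannot do, but flood_stage still applies: once crossed,
# Elasticsearch puts a read-only (allow delete) block on indices.
cluster.routing.allocation.disk.watermark.flood_stage: 99%
```

The same setting can also be changed at runtime via the cluster settings API.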

That depends on how large the disk is and what your use case is.

If you are talking about a 4TB drive that Elasticsearch can use, then 99% leaves about 41GB to work with. That's not a lot if Elasticsearch needs to merge a large index.
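The headroom arithmetic is easy to check; a small sketch, treating 1TB as 1024GB (which matches the "4TB → ~41GB at 99%" figure above):

```python
def headroom_gb(disk_tb: float, watermark_pct: float) -> float:
    """GB still free when disk usage sits exactly at the watermark."""
    disk_gb = disk_tb * 1024
    return disk_gb * (100 - watermark_pct) / 100

print(headroom_gb(4, 99))  # ~40.96 GB free on a 4 TB drive at 99%
print(headroom_gb(2, 99))  # ~20.48 GB free on a 2 TB drive at 99%
print(headroom_gb(2, 95))  # ~102.4 GB free at the default flood stage
```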

Our typical disk size at the customer end is 2TB. I wanted to dig into the cases in which that space will be needed. One reason I could think of is disk expansion, as Elasticsearch supports only a single data path per node. Could you please elaborate on the index merge use case, with some references?

The underlying Lucene segments of an index are merged over time. So if you have 2 segments of (e.g.) 5GB each, you need that 5 x 2 plus up to another 10GB of free space for the merge to happen in, until the merge completes and the original segments are deleted.
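As a rough model of that headroom requirement (a sketch only; real Lucene merges choose segments by merge policy, and the merged segment is often smaller than the sum of its inputs once deleted documents are dropped):

```python
def merge_peak_usage_gb(segment_sizes_gb):
    """Worst-case disk usage during a merge: the original segments stay
    on disk until the merged segment (at most their combined size) is
    fully written, so peak usage is roughly 2x the merged inputs."""
    total = sum(segment_sizes_gb)
    return total + total  # originals + in-progress merged copy

# Two 5 GB segments: 10 GB on disk plus up to another 10 GB during the merge.
print(merge_peak_usage_gb([5, 5]))  # 20
```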


Thanks for this explanation. We will have shards of 50GB each, heavily ingested with user email data. I would like to know: what is the default segment size, how frequently are segments merged, and is there a maximum segment size or merge threshold parameter?

There is no default segment size and you cannot limit the size of a segment; Elasticsearch handles merging transparently.

Merge | Elasticsearch Guide [7.13] | Elastic has some info on this.
