I am running a cluster in which all data nodes have their data directories mounted on separate 6 TB disks.
Today the cluster reached 84% disk usage and I was forced to delete a bunch of old indexes even though I still had almost 1TB free on each node.
Since the disks fill up equally, the scenario of a single node running out of disk is not really relevant for me.
I am now considering increasing the defaults to something like 95%, but then I started wondering whether there is a reason behind the 85%/90% defaults.
Is there any special reason for those values, and will it be fine if I increase them to 95% or even higher?
My reasoning is that the cluster effectively stops indexing into new indices once all nodes reach 85% anyway (no new shards can be allocated on any data node), which is also what would happen if the cluster ran completely out of disk space. So in that case it seems best to either increase the defaults or disable disk usage checking altogether. Does this make sense?
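For concreteness, this is roughly the change I have in mind, via the cluster settings API (the values are just examples, not something I have settled on):

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "95%",
    "cluster.routing.allocation.disk.watermark.high": "97%"
  }
}
```

The alternative I mentioned would be setting cluster.routing.allocation.disk.threshold_enabled to false, which disables the disk-based allocation check entirely.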
How many nodes do you have? Are all of them reaching the 85% watermark at the same time?
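If you are not sure, the _cat allocation API shows per-node disk usage, for example:

```
GET _cat/allocation?v
```

This lists disk.used, disk.avail and disk.percent for each data node, so you can see how close each one is to the watermark.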
The 85% (low) watermark means that no new shards will be allocated to a node whose disk usage is above 85%. It has nothing to do with indexing new documents into existing shards.
Indexing into existing shards keeps working above the watermark; the node just won't receive any newly created shards.
If your use case is to keep creating new indices, which means new shards, you can increase the disk watermarks.
If a node completely runs out of disk space, the indices on that node can go red. Depending on your replica settings, you may or may not be able to recover them.
As the threshold is percentage based, the amount of disk space left when the threshold is reached will vary depending on the size of the disk. This spare space is needed to handle merging, which can temporarily double the amount of space used by a shard. If you have large disks and most indices are not actively written to (and therefore less likely to merge), you can increase this parameter. Exactly how much spare space you need will however depend on your use case. You want to make sure you never run out of disk space.
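If the percentage-based defaults reserve more headroom than you need on 6 TB disks, one option (as far as I know) is to specify the watermarks as absolute amounts of free space instead of percentages, so the reserved space stays fixed regardless of disk size. The values below are purely illustrative; note that with absolute values the low watermark requires more free space than the high one:

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "500gb",
    "cluster.routing.allocation.disk.watermark.high": "250gb"
  }
}
```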
Some more details in answer to your questions: I run 3 data nodes, each with 6 TB. They fill up equally and reach 85% at the same time. I also create daily indices, so having all of them at 85% means the next day's indices won't be created.
I will increase the defaults, but will leave some headroom for merging and to avoid red indices, as you suggested.