Elasticsearch 2.3.3 Cluster Disk High/Low Watermark Net Effect Is Ambiguous

We are running into some problems with Elasticsearch's disk-based shard allocation. Mainly, the low disk watermark never gets logged, and the high disk watermark seems to be ignored! This only seems to be a problem when we are using a single-node cluster; relocation works just fine once we add another system to the cluster.

We can freely delete extraneous shards (sub-1 MB) using kopf/sense, and they will promptly get recreated (contrary to the documentation, which says new shards cannot be allocated past the watermark).

The strangest part of the whole process is that the high-watermark breach is being logged correctly:

[2017-02-22 15:28:02,129][WARN ][cluster.routing.allocation.decider] [es-1] high disk watermark [2%] exceeded on [cVPjveSMQeS6d0oxq8axZA][es-1][/opt/evertz/insite/parasite/applications/es-1/data/Development-cluster/nodes/0] free: 27.1gb[49.3%], shards will be relocated away from this node

Does anyone have some information regarding the expected behavior of the settings mentioned at https://www.elastic.co/guide/en/elasticsearch/reference/2.3/disk-allocator.html and how they are supposed to interact with a single node cluster?

We have tried this with both absolute watermarks (in MB) and percentage watermarks.
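For reference, percentage watermarks compare against *used* disk space, which is consistent with the log line above: 49.3% free means 50.7% used, well past a 2% high watermark. A minimal sketch of that comparison (`exceeds_watermark` is a hypothetical helper for illustration, not Elasticsearch code):

```python
def exceeds_watermark(free_pct, watermark_pct):
    """Watermark percentages refer to *used* disk space: the mark is
    exceeded once (100 - free%) rises above the configured value."""
    used_pct = 100.0 - free_pct
    return used_pct > watermark_pct

# Numbers from the log line above: 49.3% free vs. a 2% high watermark.
print(exceeds_watermark(49.3, 2))   # True  -> the WARN log is expected
print(exceeds_watermark(49.3, 60))  # False -> a 60% mark would not trip yet
```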

Our _cluster/settings:

{
  "persistent": {
    "cluster": {
      "routing": {
        "allocation": {
          "disk": {
            "include_relocations": "true",
            "threshold_enabled": "true",
            "watermark": {
              "low": "1%",
              "high": "2%"
            }
          }
        }
      },
      "info": {
        "update": {
          "interval": "10s"
        }
      }
    }
  },
  "transient": {}
}

Hey,

just to understand your question fully: which behaviour do you expect from the watermark checks when you only have a single node, so no rebalancing can happen anyway?

--Alex

Hello,

The expected result on a single-node cluster would be: no new indices once the low watermark is reached (with existing indices still accepting new data), and no new data accepted at all once the high watermark is reached.

The reason is that we have long-term mass-storage use cases (infrequent access) where we never want to fill more than 95% of the disk, so that other applications, such as our Node.js curator, can compute stale indices without consuming massive amounts of memory.

As it stands right now, we have watched Elasticsearch write to the final cylinder of our disk and then peg the system as it tries to write more.

Any thoughts?

  • Troy Heanssgen

Hey,

checking the source, it is explicitly mentioned that in the case of a single node the disk threshold decider is disabled and allocation is always allowed.

See https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/cluster/routing/allocation/decider/DiskThresholdDecider.java#L373
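In other words, the check boils down to something like this (a Python paraphrase of the behaviour described above, not actual Elasticsearch code; the function name is made up):

```python
def disk_threshold_applies(data_node_count, threshold_enabled=True):
    """Paraphrase of the linked decider logic: with a single data node
    there is nowhere to relocate shards to, so the disk threshold
    decider steps aside and allocation is always allowed."""
    if not threshold_enabled:
        return False
    return data_node_count > 1

print(disk_threshold_applies(1))  # False: single node, watermarks are ignored
print(disk_threshold_applies(2))  # True: watermarks are enforced
```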

--Alex

Hello,

First of all, thank you for the response. So if I understand correctly, there is absolutely no way, outside of manual/automatic external management, to limit the disk space Elasticsearch uses?

The source seems to agree with your conclusion; however, it seems fairly silly to let a database fill the entire disk just because it has no other node in the cluster to talk to.

Apart from recompiling the source, is there a different sane way to limit disk usage in ElasticSearch?
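For the record, the external-management fallback mentioned above would look something like this: a cron-driven watchdog that sets `index.blocks.read_only` once free space gets low. This is only a sketch; the URL, data path, and 5% threshold are assumptions, and `should_block`/`set_read_only` are hypothetical helpers, not an Elasticsearch API:

```python
import json
import os
import urllib.request

ES_URL = "http://localhost:9200"      # assumption: local single-node cluster
DATA_PATH = "/var/lib/elasticsearch"  # assumption: the data directory to watch
MIN_FREE_RATIO = 0.05                 # assumption: block writes below 5% free

def should_block(free_bytes, total_bytes, min_free_ratio=MIN_FREE_RATIO):
    """Pure check: True once free space drops below the configured ratio."""
    return free_bytes < total_bytes * min_free_ratio

def set_read_only(block):
    """Set index.blocks.read_only on all indices via the settings API."""
    body = json.dumps({"index.blocks.read_only": block}).encode()
    req = urllib.request.Request(ES_URL + "/_all/_settings",
                                 data=body, method="PUT")
    urllib.request.urlopen(req)

# Cron-style usage sketch:
# st = os.statvfs(DATA_PATH)
# if should_block(st.f_bavail * st.f_frsize, st.f_blocks * st.f_frsize):
#     set_read_only(True)
```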

  • Troy Heanssgen

Hey,

I opened https://github.com/elastic/elasticsearch/issues/23395

--Alex

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.