Does es-hadoop honor disk watermarks?

We are loading data into ES using es-hadoop, and some nodes are exceeding the high watermark.
If I set the low watermark to 80%, will es-hadoop redirect primary shard writes only to nodes that have available capacity?

Example setting:
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "80%"
  }
}

@JustinKuspa ES-Hadoop cannot redirect writes in that way. If a node it is communicating with stops responding, it will try to re-route the write request in the hope that the cluster has recovered the primary and can continue accepting writes. But if the primary is hosted on a node that is running low on disk space, there's not much the connector can do about that. Redirecting the request in that case just makes a different node attempt to write the data to the primary, which is still on the node with low disk space.
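
To make the routing point concrete, here is a toy model in Python. It is only a sketch, not ES-Hadoop or Elasticsearch code, and the node and shard names are made up for illustration. It shows that the node receiving the request only forwards it, so the write lands on the primary's node no matter which node the connector talks to.

```python
# Toy model only: names are hypothetical, and this is not how the
# connector or Elasticsearch is implemented internally.

# Which node currently holds the primary copy of each shard.
PRIMARY_LOCATION = {"shard-0": "node-a", "shard-1": "node-b"}


def node_that_writes(shard: str, coordinating_node: str) -> str:
    """Return the node that must accept the write first.

    The coordinating node (the one the client talks to) only forwards
    the request, so the result does not depend on it.
    """
    return PRIMARY_LOCATION[shard]


# Send a write for shard-0 through three different entry nodes.
targets = {node_that_writes("shard-0", entry) for entry in ("node-a", "node-b", "node-c")}
print(targets)  # {'node-a'}
```

If node-a is the one short on disk, changing the entry node does not move the write anywhere else.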

Thanks, James.

A better example:

Say I have 10 nodes, and 3 of them are above the low watermark before the es-hadoop job starts.

Will es-hadoop exclude these from the list of possible nodes to write to during the negotiation phase?

Currently there is no logic in ES-Hadoop for detecting free disk space and changing which nodes to write to. As I mentioned before, though, if a primary shard is sitting on one of those nodes with limited remaining space, striking the node from our list of write nodes does not help: any node that we write to will end up having to route the request to that node anyway, because all writes must be accepted by the primary shard before they are replicated.
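
Since the connector does not do this check, one option is to look at disk usage yourself before starting the job. The Python below is a minimal sketch of that idea. It assumes you have already fetched and parsed the rows from `GET _cat/allocation?format=json`, and that each row carries `node` and `disk.percent` fields with the percentage as a string. The function name and the sample rows are made up for illustration.

```python
def nodes_over_watermark(allocation_rows, watermark_percent):
    """Return the names of nodes whose disk usage exceeds the given percentage.

    allocation_rows: list of dicts, assumed to look like the parsed JSON
    rows of the cat allocation API (field names are an assumption here).
    """
    over = []
    for row in allocation_rows:
        percent = row.get("disk.percent")
        if percent is None:
            # Rows without disk figures (for example an UNASSIGNED row) are skipped.
            continue
        if int(percent) > watermark_percent:
            over.append(row["node"])
    return over


# Made-up sample data, not output from a real cluster.
sample_rows = [
    {"node": "node-1", "disk.percent": "85"},
    {"node": "node-2", "disk.percent": "42"},
    {"node": "UNASSIGNED", "disk.percent": None},
]
print(nodes_over_watermark(sample_rows, 80))  # ['node-1']
```

A non-empty result before the job starts is a hint to free up space or rebalance shards first, since the writes cannot be steered away from those nodes by the connector.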
