We are loading data into ES using es-hadoop, and some nodes are exceeding the high watermark.
If I set the low watermark to 80%, will es-hadoop redirect the primary shard writes to only nodes that have available capacity?
Example setting:
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "80%"
  }
}
@JustinKuspa ES-Hadoop cannot redirect writes in that way. The low watermark only influences where Elasticsearch allocates shards; it does not change where writes to existing shards are routed. If a node the connector is communicating with stops responding, it will try to re-route the write request in the hope that the cluster has recovered the primary and can continue accepting writes. But if the primary is hosted on a node that is running low on disk space, there is not much the connector can do about it. Redirecting the request in that case just means a different node forwards the write to the same primary, which is still on the node with low disk space.
Currently, ES-Hadoop has no logic for detecting free disk space and changing which nodes it writes to. As I mentioned above, if a primary shard is sitting on one of the nodes with limited remaining space, then even if we strike that node from our list of write nodes, any other node we write to will still have to route the request to it, because every write must be accepted by the primary shard before it is replicated.
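Since the connector cannot do this for you, it may help to see which primaries actually live on the nodes that are running out of space. Below is a minimal sketch, not part of ES-Hadoop, that cross-references the output of `_cat/allocation?format=json` and `_cat/shards?format=json`. The helper names `fetch_cat` and `primaries_on_full_nodes` are made up for illustration, the 90% threshold is an assumption (it matches the default high watermark), and `fetch_cat` assumes a cluster reachable over plain HTTP without authentication.

```python
import json
from urllib.request import urlopen


def fetch_cat(base_url, endpoint):
    """Fetch a _cat endpoint as a list of JSON rows.

    Hypothetical helper: assumes an unsecured cluster at base_url,
    e.g. fetch_cat("http://localhost:9200", "allocation").
    """
    with urlopen(f"{base_url}/_cat/{endpoint}?format=json") as resp:
        return json.load(resp)


def primaries_on_full_nodes(allocation_rows, shard_rows, threshold_pct=90):
    """Return (index, shard, node) for primaries on nodes at or above threshold_pct.

    allocation_rows: rows from _cat/allocation?format=json
    shard_rows:      rows from _cat/shards?format=json
    threshold_pct:   assumed cutoff; 90 matches the default high watermark
    """
    full_nodes = set()
    for row in allocation_rows:
        pct = row.get("disk.percent")
        if pct is None:
            # The UNASSIGNED row has no disk figures; skip it.
            continue
        if int(pct) >= threshold_pct:
            full_nodes.add(row["node"])

    # "prirep" is "p" for a primary and "r" for a replica.
    return sorted(
        (row["index"], int(row["shard"]), row["node"])
        for row in shard_rows
        if row.get("prirep") == "p" and row.get("node") in full_nodes
    )
```

Against a live cluster you would call something like `primaries_on_full_nodes(fetch_cat(url, "allocation"), fetch_cat(url, "shards"))`. Any primary it reports will keep receiving writes no matter which node ES-Hadoop sends the request to, so the remedy is on the cluster side (freeing space or moving those shards), not in the connector.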