Using repository-hdfs on (different) YARN cluster

(Johannes Zillmann) #1

Hey there, question!


  • I have an ES cluster running on top of YARN via elasticsearch-yarn
  • the ES cluster has the plugin repository-hdfs installed

What is the best way to ensure that the plugin's Hadoop libs match those on the cluster?

Check out and build the plugin with the correct flavour/Hadoop version (as described in "Issues with using repository-hdfs plug in for snapshot/restore operation")?

Or is there any way to tell the plugin to use the YARN/Hadoop classpath instead of its own plugin lib folder? (That would be preferred, because I want to run ES on various different Hadoop distributions!)

Any input appreciated!

(Costin Leau) #2

Moving forward, the plugin will work with the Hadoop libraries that are embedded/known in its lib folder. The reason for this is security: delegating to the classpath means we don't know the code source, that is, which libraries are actually used, as opposed to the plugin/embedded approach, where their location is clearly known.

So while it is a bit more work, selecting the jars and putting them in the plugin folder is the solution regardless of the Hadoop flavor/platform (a symlink or mounted path can reduce the effort).
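
A minimal sketch of the symlink approach, assuming the cluster's Hadoop jars sit in a single directory and that jars whose names start with `hadoop-` are the ones the plugin needs. The function name `link_hadoop_jars`, the `prefixes` filter, and any paths passed in are hypothetical and not part of the plugin; which jars are actually required depends on the Hadoop distribution.

```python
from pathlib import Path


def link_hadoop_jars(hadoop_lib_dir, plugin_dir, prefixes=("hadoop-",)):
    """Symlink matching jars from hadoop_lib_dir into plugin_dir.

    Returns the sorted list of jar file names that were linked.
    The prefix filter is an assumption; adjust it per distribution.
    """
    src = Path(hadoop_lib_dir)
    dst = Path(plugin_dir)
    dst.mkdir(parents=True, exist_ok=True)

    linked = []
    for jar in sorted(src.glob("*.jar")):
        if not jar.name.startswith(tuple(prefixes)):
            continue  # skip jars that do not match the assumed prefixes
        target = dst / jar.name
        # Replace an existing file or stale link that has the same name.
        if target.is_symlink() or target.exists():
            target.unlink()
        target.symlink_to(jar.resolve())
        linked.append(jar.name)
    return linked
```

It could be called as `link_hadoop_jars("/path/to/hadoop/lib", "/path/to/es/plugins/repository-hdfs")`, with both paths replaced by the real locations on each node. The sketch only replaces files with an identical name, so bundled Hadoop jars of a different version would have to be removed from the plugin folder first to avoid mixing versions.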
