ES-Hadoop mistakenly detects multiple versions on Azure


(Sor123) #1

Hi,

I'm running ES-Hadoop on an Azure HDInsight cluster. When I register version 2.3.0 in Pig, it immediately fails with a "multiple ES-Hadoop versions" error. I created a new cluster and registered only one version, so I don't know whether it is a bug that it finds multiple versions or whether something else is wrong.

See below for the error message. It lists the same path twice; the only difference is the case of the drive letter (c: versus C:). Obviously these are not different versions. Is this something that can be fixed or prevented?

Error:
Multiple ES-Hadoop versions detected in the classpath; please use only one
jar:file:/c:/apps/temp/hdfs/nm-local-dir/usercache/userxyz/appcache/application_xyz/container_xyz/elasticsearch-hadoop-2.3.0.jar
jar:file:/C:/apps/temp/hdfs/nm-local-dir/usercache/userxyz/appcache/application_xyz/container_xyz/elasticsearch-hadoop-2.3.0.jar

Thanks


(Costin Leau) #2

That looks like a URL problem on Azure. Since classpath entries are actually case sensitive, the two resources do register twice, so as far as the connector is concerned there is indeed a classpath issue.
The whole reason this check was added in the first place was to prevent the oh-so-common bug where multiple versions were added to the classpath (folks were just adding jars without checking), which led to very weird bugs and class conflicts.
After the check was added, this problem was pretty much eliminated.

Back to this case - the fact that the same drive appears under two spellings (c: vs C:) should be something configurable in your Azure cluster; maybe there's a symlink somewhere or the JVM classpath contains the drive registered twice.
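
To illustrate why the two entries count as separate resources, here is a minimal sketch in Python. It is not the connector's actual check (that lives in the Java code base); it only assumes that the entries are compared as plain, case-sensitive strings. The helper `normalize_drive` is hypothetical and exists just to show that the two entries collapse into one once the drive letter is written consistently.

```python
import re

# The two classpath entries from the error message above.
PATH = ("/apps/temp/hdfs/nm-local-dir/usercache/userxyz/appcache/"
        "application_xyz/container_xyz/elasticsearch-hadoop-2.3.0.jar")
urls = [
    "jar:file:/c:" + PATH,
    "jar:file:/C:" + PATH,
]


def normalize_drive(url):
    """Upper-case a Windows drive letter in a file: or jar:file: URL.

    Hypothetical helper for illustration only; not part of ES-Hadoop.
    URLs without a drive letter are returned unchanged.
    """
    return re.sub(
        r"^((?:jar:)?file:/+)([a-zA-Z]):",
        lambda m: m.group(1) + m.group(2).upper() + ":",
        url,
    )


# Case-sensitive comparison: the two strings are different entries.
distinct_raw = set(urls)

# After normalizing the drive letter they are the same entry.
distinct_normalized = {normalize_drive(u) for u in urls}

print(len(distinct_raw))         # 2
print(len(distinct_normalized))  # 1
```

So the jar is the same file, but it reaches the classpath under two spellings. The practical fix is to make whatever builds the classpath on the cluster use one spelling of the drive letter.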


(Sor123) #3

Thanks for the answer.
I haven't found a way to resolve this yet, but I'll keep looking, or maybe work with a custom jar.

