I am upgrading my Elasticsearch plugin from ES 1.x to ES 2.3.5. I made the required changes and built it using Maven. When I tried to install the plugin into Elasticsearch, it threw the following error:
We added the JAR hell check in Elasticsearch 2 to avoid issues with duplicate classes on the classpath. As both JARs contain the same class, you need to get rid of one of them. You can check the dependency tree of your plugin with mvn dependency:tree and then explicitly exclude one of them in your POM (the more likely candidate for exclusion is javax.json-1.0.4.jar).
Sorry to hear that. In Elasticsearch 2 we put a lot of effort into robustness, and the JAR hell check is one of the measures we took. We also run it in each build of Elasticsearch itself. The reason is that we want to avoid duplicate classes on the classpath, which can lead to bugs that are really hard to track down because they depend on the order in which the classes get loaded.
So we want to avoid the root cause by not allowing duplicate classes at all. Thus, the only way to prevent this error is eliminating the root cause, i.e. getting rid of duplicate classes in the first place.
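To see the order dependence described above for a concrete class, one option is to ask the class loader for every location that provides it. The following is a minimal sketch, not something taken from Elasticsearch: the class name ClassLocations and the helper locations are made up for illustration, and it assumes a plain classpath-based system class loader, where the first location listed is normally the copy that gets loaded.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper for illustration; not part of Elasticsearch.
public class ClassLocations {

    /** Returns every location visible to the system class loader that provides the given class. */
    static List<URL> locations(String className) {
        // A class is stored as a resource, e.g. javax.json.Json -> javax/json/Json.class
        String resource = className.replace('.', '/') + ".class";
        List<URL> found = new ArrayList<>();
        try {
            Enumeration<URL> urls = ClassLoader.getSystemClassLoader().getResources(resource);
            while (urls.hasMoreElements()) {
                found.add(urls.nextElement());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        // Pass fully qualified class names as arguments, e.g. javax.json.Json
        for (String className : args) {
            List<URL> found = locations(className);
            System.out.println(className + ": " + found.size() + " location(s)");
            for (URL url : found) {
                System.out.println("  " + url);
            }
        }
    }
}
```

Running it with the plugin's full classpath and a suspect class name as the argument prints one line per JAR that contains the class. More than one line means the class that actually gets used depends on classpath order.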
Interesting. Just out of curiosity, what is Elasticsearch using to check for this? I was playing with JHades earlier and found a few potential conflicts in my own code base, but it is kind of ugly since it prints to System.out and lacks a bit in flexibility. However, given the stuff it found, I definitely need more checks like this.
In Java, it is perfectly valid to have two identical classes on the class path, because only the first one will be loaded. This is part of the Java Language Specification, implemented by JVM caching.
ES is checking for duplicate class names. It does not detect identical classes by checking the binary representations of the class. So the check is too strict. It's the binary representation the JVM caches, not the class name (which is given by classloader and binary name of class).
In particular, a class loader may cache binary representations of classes and interfaces, prefetch them based on expected usage, or load a group of related classes together. These activities may not be completely transparent to a running application.
By binary representation caching, the JVM ensures that identical classes are loaded only once during the JVM lifetime, no matter what class loader is in use. That's also the reason why the method findLoadedClass exists, which can be used to check whether a class is in the cache and was loaded by this class loader; see ClassLoader (Java Platform SE 8).
That's right. In that respect we're even more strict. So I was not entirely precise by saying we don't allow duplicate classes. What I meant was that we don't allow two classes with the same class name on the classpath.
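A check of this kind, by class name only and without comparing bytes, can be sketched in a few lines. This is an illustration under assumptions and not Elasticsearch's actual implementation: the class DuplicateClassCheck and its methods are hypothetical names, and it only looks at .class entries inside the JAR files passed on the command line.

```java
import java.io.IOException;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical sketch of a name-based duplicate check; not Elasticsearch code.
public class DuplicateClassCheck {

    /** Lists the .class entry names of one JAR, e.g. "javax/json/Json.class". */
    static List<String> classEntries(String jarPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(Paths.get(jarPath).toFile())) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".class")) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    /** Maps each class name found in more than one JAR to the JARs that contain it. */
    static Map<String, List<String>> findDuplicates(Map<String, List<String>> entriesByJar) {
        Map<String, List<String>> owners = new TreeMap<>();
        for (Map.Entry<String, List<String>> jar : entriesByJar.entrySet()) {
            for (String entry : jar.getValue()) {
                // "javax/json/Json.class" -> "javax.json.Json"
                String className = entry
                        .substring(0, entry.length() - ".class".length())
                        .replace('/', '.');
                owners.computeIfAbsent(className, k -> new ArrayList<>()).add(jar.getKey());
            }
        }
        // Keep only names that occur in at least two JARs; contents are never compared.
        owners.values().removeIf(jars -> jars.size() < 2);
        return owners;
    }

    public static void main(String[] args) throws IOException {
        Map<String, List<String>> entriesByJar = new LinkedHashMap<>();
        for (String jarPath : args) {
            entriesByJar.put(jarPath, classEntries(jarPath));
        }
        findDuplicates(entriesByJar)
                .forEach((cls, jars) -> System.out.println(cls + " found in " + jars));
    }
}
```

Because only names are compared, two byte-for-byte identical copies of a class are reported just like two conflicting ones, which is the stricter behaviour described above.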