Indexing fails quietly and makes the client hang

Hi,

I installed a custom ES plugin on my ES server to make some custom Lucene analyzers available for indexing and text search. I am using the released 0.16.1 version.

Then my Java client, which sends JSON documents to be indexed, hung forever. jstack dumps and the logs didn't show anything wrong on the Elasticsearch side.
I then started debugging on my local machine, starting the server manually, and I could see a stack trace (see the end of this mail). It is a NoClassDefFoundError, which is an Error, not an Exception. Since the code only catches Exception, this doesn't go to a logger but straight to the console.
So in TransportShardReplicationOperationAction#performOnPrimary, the try-catch should probably catch Throwable rather than Exception? And in many other places too?
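
To illustrate the difference, here is a minimal sketch. It is not Elasticsearch's actual code: the Listener interface, failingOperation and the perform* methods are hypothetical stand-ins. It runs the same failing operation on a pool thread twice, once guarded by catch (Exception) and once by catch (Throwable), and reports whether the caller was ever notified.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

final class CatchThrowableSketch {

    /** Hypothetical callback standing in for whatever answers the waiting client. */
    interface Listener {
        void onResponse(String result);
        void onFailure(Throwable cause);
    }

    /** Hypothetical operation that fails the way the missing ICU jar did. */
    static String failingOperation() {
        throw new NoClassDefFoundError("com/ibm/icu/text/Transliterator");
    }

    /** Catches only Exception: an Error escapes and the listener is never called. */
    static void performCatchingException(Listener listener) {
        try {
            listener.onResponse(failingOperation());
        } catch (Exception e) {
            listener.onFailure(e);
        }
    }

    /** Catches Throwable: the Error is handed to the listener. */
    static void performCatchingThrowable(Listener listener) {
        try {
            listener.onResponse(failingOperation());
        } catch (Throwable t) {
            listener.onFailure(t);
        }
    }

    /** Runs one variant on a pool thread; returns what the listener saw, or null if it was never notified. */
    static Throwable runAndWait(boolean catchThrowable) {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        CountDownLatch notified = new CountDownLatch(1);
        Listener listener = new Listener() {
            @Override public void onResponse(String result) { notified.countDown(); }
            @Override public void onFailure(Throwable cause) { seen.set(cause); notified.countDown(); }
        };
        ExecutorService pool = Executors.newSingleThreadExecutor();
        pool.execute(() -> {
            if (catchThrowable) {
                performCatchingThrowable(listener);
            } else {
                performCatchingException(listener);
            }
        });
        try {
            // A real client without a timeout would wait here forever in the Exception-only case.
            notified.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("catch (Throwable): " + runAndWait(true));
        System.out.println("catch (Exception): " + runAndWait(false));
    }
}
```

In the catch (Exception) run the Error kills the worker thread and is printed by the JVM's default handler instead of a logger, which matches what I observed.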

Nicolas

---------------- stack trace ----------------

Exception in thread "elasticsearch[index]-pool-2-thread-1" java.lang.NoClassDefFoundError: com/ibm/icu/text/Transliterator
at it.scoop.module.search.analysis.AccentFilter.(AccentFilter.java:15)
at it.scoop.module.search.es.AccentFilterFactory.create(AccentFilterFactory.java:17)
at org.elasticsearch.index.analysis.CustomAnalyzer.buildHolder(CustomAnalyzer.java:85)
at org.elasticsearch.index.analysis.CustomAnalyzer.reusableTokenStream(CustomAnalyzer.java:73)
at org.elasticsearch.index.analysis.NamedAnalyzer.reusableTokenStream(NamedAnalyzer.java:81)
at org.elasticsearch.index.analysis.FieldNameAnalyzer.reusableTokenStream(FieldNameAnalyzer.java:58)
at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:123)
at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:248)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:701)
at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2042)
at org.elasticsearch.index.engine.robin.RobinEngine.innerIndex(RobinEngine.java:459)
at org.elasticsearch.index.engine.robin.RobinEngine.index(RobinEngine.java:374)
at org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:292)
at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:183)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:418)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.access$100(TransportShardReplicationOperationAction.java:233)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:331)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
Caused by: java.lang.ClassNotFoundException: com.ibm.icu.text.Transliterator
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
... 20 more

Yeah, that failure will not be caught, and so the request will never return. It's a config error though: once you place the relevant jars on the classpath, it will work.
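
One way to surface this kind of config error early is to probe for the required class before it is first used. This is only a sketch under assumptions: DependencyCheckSketch and isOnClasspath are hypothetical names, and nothing here is part of Elasticsearch's plugin API. Only Class.forName is a real JDK call.

```java
final class DependencyCheckSketch {

    /**
     * Returns true if the named class can be located by this class's loader.
     * The class is not initialized, so no static initializers run.
     */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, DependencyCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class name comes from the stack trace in this thread.
        String required = "com.ibm.icu.text.Transliterator";
        if (!isOnClasspath(required)) {
            System.err.println("Missing dependency: " + required + " (is the ICU jar on the classpath?)");
        }
    }
}
```

A plugin could run such a check when it starts and fail with a clear message, rather than failing on the first indexed document.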

On Wednesday, June 1, 2011 at 11:30 AM, Nicolas Lalevée wrote:

On June 1, 2011, at 15:37, Shay Banon wrote:

Yeah, that failure will not be caught, and so the request will never return. It's a config error though: once you place the relevant jars on the classpath, it will work.

Yes, I already fixed my config and now everything works like a charm. I have also set a timeout on the client side so it won't hang for long if anything goes really wrong.
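
For reference, the client-side guard amounts to a bounded wait. The sketch below uses only java.util.concurrent and assumes the request is exposed as a plain Future; ClientTimeoutSketch and await are hypothetical names, not the Elasticsearch client API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class ClientTimeoutSketch {

    /**
     * Waits at most timeoutMillis for the response.
     * Throws IllegalStateException if the server does not answer in time or the request fails.
     */
    static <T> T await(Future<T> response, long timeoutMillis) {
        try {
            return response.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            response.cancel(true); // give up on the pending request
            throw new IllegalStateException("no response within " + timeoutMillis + " ms", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the response", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("request failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        // A future that is never completed stands in for the request the server never answers.
        Future<String> neverAnswered = new CompletableFuture<>();
        try {
            await(neverAnswered, 200);
        } catch (IllegalStateException e) {
            System.out.println("gave up: " + e.getMessage());
        }
    }
}
```

This only bounds how long the client waits; it does not tell the client why the server never replied.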

My worry here is how quiet Elasticsearch is when a Throwable that is not an Exception is thrown, especially for things like OutOfMemoryError and StackOverflowError.
Hence my suggestion to catch Throwable rather than Exception. Another option is to set an UncaughtExceptionHandler on the thread so the error gets logged properly, but this won't fix the hanging client.
But I'm not confident enough with that part of the code to suggest patches.
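
To make the UncaughtExceptionHandler idea concrete, here is a rough sketch. LoggingThreadFactorySketch and demo are hypothetical names, java.util.logging is used only to keep the example self-contained, and this is not how Elasticsearch builds its thread pools. A ThreadFactory installs the handler on every pool thread, so an Error escaping a task reaches the logger.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.logging.Level;
import java.util.logging.Logger;

final class LoggingThreadFactorySketch implements ThreadFactory {

    private final Thread.UncaughtExceptionHandler handler;

    LoggingThreadFactorySketch(Thread.UncaughtExceptionHandler handler) {
        this.handler = handler;
    }

    @Override
    public Thread newThread(Runnable task) {
        Thread thread = new Thread(task);
        // Anything that escapes the task on this thread is passed to the handler.
        thread.setUncaughtExceptionHandler(handler);
        return thread;
    }

    /** Runs a task that throws an Error and returns what the handler received (null if nothing). */
    static Throwable demo() {
        Logger log = Logger.getLogger("sketch.uncaught");
        AtomicReference<Throwable> seen = new AtomicReference<>();
        CountDownLatch handled = new CountDownLatch(1);
        ExecutorService pool = Executors.newSingleThreadExecutor(
                new LoggingThreadFactorySketch((thread, error) -> {
                    log.log(Level.SEVERE, "Uncaught throwable in " + thread.getName(), error);
                    seen.set(error);
                    handled.countDown();
                }));
        // execute(), not submit(): submit() would capture the Error inside the returned Future.
        pool.execute(() -> {
            throw new StackOverflowError("simulated");
        });
        try {
            handled.await(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("handler received: " + demo());
    }
}
```

As said above, this only gets the error into the log. The caller waiting for a response is still never notified.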

Nicolas

Yeah, it requires some thought. There is actually specific handling for OOM (which is the most common one).

On Wednesday, June 1, 2011 at 5:04 PM, Nicolas Lalevée wrote:
