Fatal error with ingest-attachment plugin

I am getting a fatal error when trying to index the attached stripped-down document using the ingest-attachment plugin. The error causes my cluster to reboot, and I do not receive an email notification. The embedded Visio diagram appears to be the problem.

Version: 5.1.1
ClusterId: 7e7501
Node: instance-0000000005
Plugin: ingest-attachment

Link to problematic doc: https://1drv.ms/w/s!ApTXXtrEV_GGiosenEfoUSk1rRnuYA

Error Information
Dec 8 21:42:34 ERROR org.elasticsearch.bootstrap.ElasticsearchUncaughtExceptionHandler i5@z0

[2016-12-08T21:42:34,628][ERROR][org.elasticsearch.bootstrap.ElasticsearchUncaughtExceptionHandler] fatal error in thread [elasticsearch[index][T#1]], exiting
java.lang.NoClassDefFoundError: com/graphbuilder/curve/Point
    at java.lang.Class.getDeclaredConstructors0(Native Method) ~[?:1.8.0_72]
    at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671) ~[?:1.8.0_72]
    at java.lang.Class.getConstructor0(Class.java:3075) ~[?:1.8.0_72]
    at java.lang.Class.getDeclaredConstructor(Class.java:2178) ~[?:1.8.0_72]
    at org.apache.poi.xdgf.util.ObjectFactory.put(ObjectFactory.java:34) ~[?:?]
    at org.apache.poi.xdgf.usermodel.section.geometry.GeometryRowFactory.(GeometryRowFactory.java:39) ~[?:?]
    at org.apache.poi.xdgf.usermodel.section.GeometrySection.(GeometrySection.java:55) ~[?:?]
    at org.apache.poi.xdgf.usermodel.XDGFSheet.(XDGFSheet.java:77) ~[?:?]
    at org.apache.poi.xdgf.usermodel.XDGFShape.(XDGFShape.java:113) ~[?:?]
    at org.apache.poi.xdgf.usermodel.XDGFShape.(XDGFShape.java:107) ~[?:?]
    at org.apache.poi.xdgf.usermodel.XDGFBaseContents.onDocumentRead(XDGFBaseContents.java:82) ~[?:?]
    at org.apache.poi.xdgf.usermodel.XDGFMasterContents.onDocumentRead(XDGFMasterContents.java:66) ~[?:?]
    at org.apache.poi.xdgf.usermodel.XDGFMasters.onDocumentRead(XDGFMasters.java:101) ~[?:?]
    at org.apache.poi.xdgf.usermodel.XmlVisioDocument.onDocumentRead(XmlVisioDocument.java:106) ~[?:?]
    at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:190) ~[?:?]
    at org.apache.poi.xdgf.usermodel.XmlVisioDocument.(XmlVisioDocument.java:79) ~[?:?]
    at org.apache.poi.xdgf.extractor.XDGFVisioExtractor.(XDGFVisioExtractor.java:41) ~[?:?]
    at org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:207) ~[?:?]
    at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:86) ~[?:?]
    at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:87) ~[?:?]
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[?:?]
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120) ~[?:?]
    at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72) ~[?:?]
    at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102) ~[?:?]
    at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedFile(AbstractOOXMLExtractor.java:311) ~[?:?]
    at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedParts(AbstractOOXMLExtractor.java:202) ~[?:?]
    at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:115) ~[?:?]
    at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:112) ~[?:?]
    at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:87) ~[?:?]
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[?:?]
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120) ~[?:?]
    at org.apache.tika.Tika.parseToString(Tika.java:568) ~[?:?]
    at org.elasticsearch.ingest.attachment.TikaImpl$1.run(TikaImpl.java:94) ~[?:?]
    at org.elasticsearch.ingest.attachment.TikaImpl$1.run(TikaImpl.java:91) ~[?:?]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_72]
    at org.elasticsearch.ingest.attachment.TikaImpl.parse(TikaImpl.java:91) ~[?:?]
    at org.elasticsearch.ingest.attachment.AttachmentProcessor.execute(AttachmentProcessor.java:72) ~[?:?]
    at org.elasticsearch.ingest.CompoundProcessor.execute(CompoundProcessor.java:100) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.ingest.Pipeline.execute(Pipeline.java:58) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.ingest.PipelineExecutionService.innerExecute(PipelineExecutionService.java:166) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.ingest.PipelineExecutionService.access$000(PipelineExecutionService.java:41) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.ingest.PipelineExecutionService$1.doRun(PipelineExecutionService.java:65) ~[elasticsearch-5.1.1.jar:5.1.1]
    at ....

Please format your code using the </> icon as explained in this guide. It will make your post more readable.

Thanks for reporting. Indeed, we don't include all of the dependencies (some of them play very badly with the security manager, for example), so this error is a side effect of that.

But I think we should do a better job and throw a cleaner exception in such a case.
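To illustrate what a "cleaner exception" could look like, here is a minimal sketch, not the actual plugin code or the eventual patch. It assumes the parsing step can be passed in as a `Supplier<String>`; the class `SafeParseDemo`, the method `parseSafely`, and the exception `ParseFailedException` are hypothetical names made up for this example. The idea is that `NoClassDefFoundError` is a `LinkageError`, so it can be caught at the call site and rethrown as an ordinary exception instead of reaching the uncaught exception handler.

```java
import java.util.function.Supplier;

public class SafeParseDemo {

    /** Hypothetical exception type, used only for this illustration. */
    static class ParseFailedException extends RuntimeException {
        ParseFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Runs the given parser and converts linkage problems (for example a
     * NoClassDefFoundError caused by a missing optional library) into a
     * regular runtime exception that callers can handle per document.
     */
    static String parseSafely(Supplier<String> parser) {
        try {
            return parser.get();
        } catch (LinkageError e) {
            // NoClassDefFoundError extends LinkageError.
            throw new ParseFailedException(
                "Unable to parse document, missing dependency: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        try {
            // Simulate the failure from the stack trace above.
            parseSafely(() -> {
                throw new NoClassDefFoundError("com/graphbuilder/curve/Point");
            });
        } catch (ParseFailedException e) {
            System.out.println("Handled: " + e.getMessage());
        }
    }
}
```

With this shape, a failing document would surface as a normal, per-request error rather than a fatal error in the indexing thread.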

Could you open an issue in the elasticsearch repo so we can try to add the missing libraries if possible and/or handle this situation a bit better?

Issue #22077 has been created


Awesome. I proposed a patch so that this error no longer kills your node.

As for actually parsing your doc, I will try to add the missing libs, and if adding them causes no issues I'll propose a PR to add support for Visio docs as well (including when they are embedded in a Word document).
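For anyone who wants to check whether a given class is visible on their own classpath, a small probe like the following can help. This is only a sketch: `DependencyProbe` and `isClassAvailable` are hypothetical names, and the class name is simply the one reported in the stack trace above. Note that a plugin runs with its own class loader, so a standalone check like this only reflects the classpath it is run with.

```java
public class DependencyProbe {

    /**
     * Returns true if the named class can be located by this class's
     * class loader. The class is not initialized (second argument false).
     */
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, DependencyProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the NoClassDefFoundError in the report.
        String name = "com.graphbuilder.curve.Point";
        System.out.println(name + " available: " + isClassAvailable(name));
    }
}
```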

Thanks for all the details. It has been super helpful!
