Fatal error occurring on all nodes within the hour

We are running 6.8.6 from the OSS RPM for Red Hat with JVM openjdk-1.8.0.252.b09-2.el7_8 on CentOS 7. Every second day we now get crashes on almost all nodes, all within the same hour. I managed to catch the log during a crash:

[2020-05-04T08:31:08,354][INFO ][o.e.c.m.MetaDataMappingService] [node84-indexer] [syslog-2020.05.04/-QEZ7dwzSD-2TYIaGUI2cA] update_mapping [_doc]
[2020-05-04T08:31:09,045][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [ccsvli84-coloss-indexer] fatal error in thread [elasticsearch[ccsvli84-coloss-indexer][search][T#15]], exiting
java.lang.InternalError: java.io.IOException: Stream closed
        at sun.util.locale.provider.BreakIteratorProviderImpl.getBreakInstance(BreakIteratorProviderImpl.java:178) ~[?:?]
        at sun.util.locale.provider.BreakIteratorProviderImpl.getSentenceInstance(BreakIteratorProviderImpl.java:148) ~[?:?]
        at java.text.BreakIterator.createBreakInstance(BreakIterator.java:574) ~[?:1.8.0_242]
        at java.text.BreakIterator.createBreakInstance(BreakIterator.java:553) ~[?:1.8.0_242]
        at java.text.BreakIterator.getBreakInstance(BreakIterator.java:544) ~[?:1.8.0_242]
        at java.text.BreakIterator.getSentenceInstance(BreakIterator.java:531) ~[?:1.8.0_242]
        at org.apache.lucene.search.uhighlight.BoundedBreakIteratorScanner.getSentence(BoundedBreakIteratorScanner.java:151) ~[elasticsearch-6.8.6.jar:7.7.2 d4c30fc2856154f2c1fefc589eb7cd070a415b94 - janhoy - 2019-05-28 23:31:35]
        at org.elasticsearch.search.fetch.subphase.highlight.UnifiedHighlighter.getBreakIterator(UnifiedHighlighter.java:193) ~[elasticsearch-6.8.6.jar:6.8.6]
        at org.elasticsearch.search.fetch.subphase.highlight.UnifiedHighlighter.highlight(UnifiedHighlighter.java:118) ~[elasticsearch-6.8.6.jar:6.8.6]
        at org.elasticsearch.search.fetch.subphase.highlight.HighlightPhase.hitExecute(HighlightPhase.java:107) ~[elasticsearch-6.8.6.jar:6.8.6]
        at org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:167) ~[elasticsearch-6.8.6.jar:6.8.6]
        at org.elasticsearch.search.SearchService.lambda$executeFetchPhase$3(SearchService.java:541) ~[elasticsearch-6.8.6.jar:6.8.6]
        at org.elasticsearch.search.SearchService$3.doRun(SearchService.java:381) ~[elasticsearch-6.8.6.jar:6.8.6]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.8.6.jar:6.8.6]
        at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:41) ~[elasticsearch-6.8.6.jar:6.8.6]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) ~[elasticsearch-6.8.6.jar:6.8.6]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.8.6.jar:6.8.6]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_242]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_242]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
Caused by: java.io.IOException: Stream closed
        at java.io.BufferedInputStream.getInIfOpen(BufferedInputStream.java:159) ~[?:1.8.0_242]
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) ~[?:1.8.0_242]
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) ~[?:1.8.0_242]
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345) ~[?:1.8.0_242]
        at java.io.FilterInputStream.read(FilterInputStream.java:107) ~[?:1.8.0_242]
        at sun.util.locale.provider.RuleBasedBreakIterator.readFile(RuleBasedBreakIterator.java:462) ~[?:?]
        at sun.util.locale.provider.RuleBasedBreakIterator.readTables(RuleBasedBreakIterator.java:375) ~[?:?]
        at sun.util.locale.provider.RuleBasedBreakIterator.<init>(RuleBasedBreakIterator.java:321) ~[?:?]
        at sun.util.locale.provider.BreakIteratorProviderImpl.getBreakInstance(BreakIteratorProviderImpl.java:169) ~[?:?]
        ... 19 more

It seems to be an I/O exception. What type of storage are you using? What type of hardware is this cluster deployed on?

Bare-metal servers with local hardware RAID storage.
I checked system and hardware logs and there is nothing suspicious.

I suspect a JDK bug: JDK 8 converts a non-fatal IOException into a fatal InternalError in this catch block:

        } catch (IOException | MissingResourceException | IllegalArgumentException e) {
            throw new InternalError(e.toString(), e);
        }

JDK 10 does not:

        } catch (MissingResourceException | IllegalArgumentException e) {
            throw new InternalError(e.toString(), e);
        }

I haven't dug in depth into the reasons surrounding this change.
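
To illustrate why the wrapping matters, here is a minimal sketch (not the JDK or Elasticsearch code; `WrapDemo`, `loadWrapped`, `loadUnchecked` and `classify` are hypothetical names). Once the IOException is rethrown as an InternalError it is a `java.lang.Error`, so a handler that only catches `Exception` never sees it. As far as I can tell, that is why it reaches the uncaught exception handler in the log above and the node exits.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

class WrapDemo {
    // Hypothetical stand-in for the JDK 8 catch block quoted above:
    // the IOException is rethrown as an InternalError.
    static void loadWrapped() {
        try {
            throw new IOException("Stream closed");
        } catch (IOException e) {
            throw new InternalError(e.toString(), e);
        }
    }

    // For comparison only: the same failure surfaced as a RuntimeException.
    static void loadUnchecked() {
        throw new UncheckedIOException(new IOException("Stream closed"));
    }

    // Reports which kind of handler ends up seeing the failure.
    static String classify(Runnable task) {
        try {
            task.run();
            return "ok";
        } catch (Exception e) {
            return "exception"; // recoverable path
        } catch (Error e) {
            return "error";     // InternalError lands here, not above
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(WrapDemo::loadWrapped));   // prints "error"
        System.out.println(classify(WrapDemo::loadUnchecked)); // prints "exception"
    }
}
```

The wrapped error keeps the original IOException as its cause, which matches the "Caused by: java.io.IOException: Stream closed" line in the trace.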

(edit: added links to the JDK source)

wow, very interesting find!
