Hello! Thanks in advance for any help.
I'm running a 3-node cluster (es1, es2, es3). Recently, after a VM restart, the es2 node has been crashing with SIGSEGV after several hours of work, and it keeps happening.
Here is the full error from journalctl:
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: #
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: # A fatal error has been detected by the Java Runtime Environment:
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: #
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: # SIGSEGV (0xb) at pc=0x00007f293db4c40b, pid=44800, tid=59146
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: #
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: # JRE version: OpenJDK Runtime Environment (21.0.1+12) (build 21.0.1+12-29)
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: # Java VM: OpenJDK 64-Bit Server VM (21.0.1+12-29, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: # Problematic frame:
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: # J 45401 c2 org.apache.lucene.util.compress.LZ4.compressWithDictionary([BIIILorg/apache/lucene/store/DataOutput;Lorg/apache/lucene/util/compress/LZ4$HashTable;)V org.apache.lucene.core@9.8.0 (378 bytes) @ 0x00007f293db4c40b [0x00007f293db4bd80+0x000000000000068b]
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: #
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: # Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /usr/share/elasticsearch/core.44800)
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: #
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: # An error report file with more information is saved as:
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: # /var/log/elasticsearch/hs_err_pid44800.log
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: [49810.564s][warning][os] Loading hsdis library failed
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: #
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: # If you would like to submit a bug report, please visit:
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: # https://bugreport.java.com/bugreport/crash.jsp
Jan 16 23:28:27 es2 systemd-entrypoint[44800]: #
Jan 16 23:28:31 es2 systemd-entrypoint[44715]: ERROR: Elasticsearch exited unexpectedly, with exit code 134
Jan 16 23:28:31 es2 systemd[1]: elasticsearch.service: Main process exited, code=exited, status=134/n/a
-- Subject: Unit process exited
The problematic frame looks similar every time. Last time it was:
# J 40304% c2 org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnum._next()Lorg/apache/lucene/util/BytesRef; org.apache.lucene.core@9.8.0 (828 bytes) @ 0x00007fa8cea18a95 [0x00007fa8cea18320+0x0000000000000775]
Every time, the frame is in org.apache.lucene.core@9.8.0.
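To compare the frames across crashes, a small script can pull them out of saved journalctl output. This is only a minimal sketch: the names FRAME_RE and problematic_frames are my own (hypothetical, not part of Elasticsearch or the JDK), and the pattern assumes the frame lines look like the two quoted in this post.

```python
import re
from collections import Counter

# Assumed layout of a compiled-frame line, based only on the two frames above:
#   "# J <id>[%] <c1|c2> <method>(<signature>)<ret> <module>@<version> (<n> bytes) ..."
FRAME_RE = re.compile(
    r"#\s+(?P<kind>[A-Za-z])\s+\d+%?\s+(?P<tier>c1|c2)\s+"
    r"(?P<method>[\w.$<>]+)\(.*?\s(?P<module>[\w.]+@[\w.]+)\s+\("
)


def problematic_frames(log_text):
    """Return one dict (kind, tier, method, module) per matching frame line."""
    return [
        m.groupdict()
        for line in log_text.splitlines()
        if (m := FRAME_RE.search(line))
    ]


# The two frames quoted in this post, used here as sample input.
SAMPLE = """\
# J 45401 c2 org.apache.lucene.util.compress.LZ4.compressWithDictionary([BIIILorg/apache/lucene/store/DataOutput;Lorg/apache/lucene/util/compress/LZ4$HashTable;)V org.apache.lucene.core@9.8.0 (378 bytes) @ 0x00007f293db4c40b [0x00007f293db4bd80+0x000000000000068b]
# J 40304% c2 org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnum._next()Lorg/apache/lucene/util/BytesRef; org.apache.lucene.core@9.8.0 (828 bytes) @ 0x00007fa8cea18a95 [0x00007fa8cea18320+0x0000000000000775]
"""

frames = problematic_frames(SAMPLE)
for frame in frames:
    print(frame["tier"], frame["module"], frame["method"])

# Count crashes per (compiler tier, module) to see whether they share a source.
print(Counter((f["tier"], f["module"]) for f in frames))
```

In practice the input would be the saved output of journalctl -u elasticsearch rather than the embedded sample.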
Any ideas on how to fix these crashes?
Elasticsearch version: 8.11.1
Java VM: OpenJDK 64-Bit Server VM (21.0.1+12-29)