There is enough free space (root, data paths, tmp - multiple disks are used) and memory.
java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
So the error in the log is a SIGBUS, which is pretty generic and can point to either a hardware or a software failure. The JVM you are using is already a bit old; maybe you can upgrade that first and see if the problem persists?
What is the nature of those crashes? Do they happen during the same operation, or rather at random? At the same time of day? How about the frequency?
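If it helps with the timing and frequency questions, here is a rough sketch for tabulating crash times from the JVM fatal error logs. It assumes the crashes left `hs_err_pid*.log` files behind and that each file's modification time is close to the crash time. The function names are made up for illustration, and the directory you pass in depends on your setup (the JVM working directory by default, or wherever `-XX:ErrorFile` points).

```python
from collections import Counter
from datetime import datetime
from pathlib import Path


def crash_times(log_dir):
    """Return sorted modification times of hs_err_pid*.log files in log_dir.

    Assumption: the file mtime approximates the time of the crash.
    """
    return sorted(
        datetime.fromtimestamp(p.stat().st_mtime)
        for p in Path(log_dir).glob("hs_err_pid*.log")
    )


def crashes_per_hour(times):
    """Count crashes by hour of day (0-23) to spot a time-of-day pattern."""
    return Counter(t.hour for t in times)


def gaps_in_hours(times):
    """Hours elapsed between consecutive crashes, oldest first."""
    ordered = sorted(times)
    return [(b - a).total_seconds() / 3600 for a, b in zip(ordered, ordered[1:])]
```

Calling `crash_times()` on the directory holding the error files and passing the result to the other two functions gives a quick view of whether the crashes cluster at a particular hour or recur at a regular interval. With only a handful of crashes this is a sanity check, not evidence.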
The cluster has been running for a year (upgraded to 6.2.2 one month ago; 10 nodes on bare metal, hot-warm architecture) without problems. There were 3 crashes on the same node (a warm node) in the last 7 days, at different times, and I can't see any connection to other actions. At the time of each crash there were no indexing/allocation/snapshot jobs running, only queries. So to me it looks random.