I see this in the logs just before some other messages about disconnections:
[2019-03-21T08:00:55,999][WARN ][o.e.t.n.Netty4Transport ] [node1-100.107.135.3] exception caught on transport layer [[id: 0x18f1e218, L:/100.107.135.3:64102 - R:10.226.137.176/10.226.137.176:9400]], closing connec
java.lang.ClassCastException: null
I don't think this should be happening. However, I've searched for an issue like this and cannot find anything that matches. The missing error message and stack trace make this hard to diagnose; they are omitted because the node is running with the JVM option OmitStackTraceInFastThrow enabled, which was the default prior to 6.0. Could you add -XX:-OmitStackTraceInFastThrow to the config/jvm.options file on every node and perform a rolling restart? Unfortunately the restart will allocate primaries etc. elsewhere, so the problem might go away, but hopefully it won't, and then we can see more context about that exception.
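For reference, the addition to config/jvm.options could look something like the sketch below. It assumes the usual jvm.options syntax, where lines starting with # are comments and each option sits on its own line. The comment text is only illustrative, and the rest of the file should be left as it is.

```
# Illustrative comment: keep full stack traces for exceptions that the JIT
# would otherwise rethrow without a message or trace
-XX:-OmitStackTraceInFastThrow
```

Note the minus sign after -XX: which turns the option off. Once a node has been restarted with this line in place, the next occurrence of the ClassCastException should be logged with its full stack trace.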