A corrupted shard caused an ES node to go down

Has anyone encountered this kind of problem? A shard that was corrupted but reported as healthy caused an ES node to shut down, and a memory dump was created.

We are running ES 5.4.1. We have an index with more than 200 GB of data and 12 shards. A few days ago, a node holding one of this index's shards shut down. We could start the node again and the cluster turned green, but after a while the node shut down again. We tried this several times, and it shut down every time. [During this period, a few Logstash instances were inserting data into the cluster, including into this index.]

At first we guessed that the node was under heavy load, so we tried to move the shard to another node. Strangely, the shard could not be moved: the reroute command always failed. But when we moved other shards on this node, even bigger ones, the move always succeeded.
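For readers who want to reproduce the comparison, here is a minimal sketch of the two request bodies involved: a `move` command for `POST /_cluster/reroute`, and a request for `/_cluster/allocation/explain`, which can report why a shard copy is or is not allowed to move. The helper names and the index and node names in the usage note are hypothetical; the exact command we ran is not shown in this thread.

```python
import json
from urllib import request


def build_move_command(index, shard, from_node, to_node):
    """Build the body for POST /_cluster/reroute that moves one shard copy."""
    return {
        "commands": [
            {
                "move": {
                    "index": index,
                    "shard": shard,
                    "from_node": from_node,
                    "to_node": to_node,
                }
            }
        ]
    }


def build_explain_request(index, shard, primary=True):
    """Build the body for /_cluster/allocation/explain for one shard copy."""
    return {"index": index, "shard": shard, "primary": primary}


def post_json(base_url, path, body):
    """POST a JSON body to the cluster and return the decoded JSON response.

    base_url is an assumption, e.g. "http://localhost:9200".
    urlopen raises urllib.error.HTTPError on a 4xx/5xx response, which is
    where a rejected reroute would surface.
    """
    req = request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Usage would look something like `post_json("http://localhost:9200", "/_cluster/reroute", build_move_command("my-index", 3, "node-a", "node-b"))`, with all four values replaced by real ones. If the move is rejected, the error body of the response, or the output of the allocation explain request for the same shard, is the kind of detail worth attaching to a report like this one.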

So we concluded that this shard was corrupted, even though it showed as healthy. After we deleted the index completely, the node returned to normal.

Has anyone encountered this kind of problem before?
Any ideas are appreciated.

Which version?
Can you share the logs from before it crashed?

The ES version is 5.4.1. There are no related log entries about this crash. We only observed that a memory dump was generated at that moment.
Unfortunately, due to system limitations, the dump contains only a few lines.

Is there any possibility that a shard that looks healthy but contains abnormal data could crash the node?

Is there any possibility that a shard that looks healthy but contains abnormal data could crash the node?

Well, not on purpose. @jasontedor does this remind you of anything?

The ES version is 5.4.1.

Could you upgrade to the latest version?

What is your exact JVM version? (The output of java -version.)

1.8.0_66-b17

Can you upgrade to the latest JVM version? For example:

java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)

Thanks for your suggestion. Sure, we can do that.

Are there any reported bugs related to older JDK versions?

Officially, we support Oracle JVM 1.8u60+ and IcedTea OpenJDK 1.8.0.111+.

See https://www.elastic.co/support/matrix#matrix_jvm
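As a rough illustration of checking a reported version against that minimum, here is a small sketch that extracts the update number from a Java 8 version string and compares it with 60 (taken from the "1.8u60+" statement above). The function names are hypothetical, and the check only covers the Oracle 1.8 case; meeting the minimum does not rule out JVM bugs that were fixed in later updates.

```python
import re

# Minimum Oracle Java 8 update, from "Oracle JVM 1.8u60+" in this thread.
MIN_ORACLE_UPDATE = 60


def parse_java8_update(version_text):
    """Return the update number from a string such as '1.8.0_66-b17'."""
    match = re.search(r"1\.8\.0_(\d+)", version_text)
    if match is None:
        raise ValueError("not a Java 8 version string: %r" % version_text)
    return int(match.group(1))


def meets_minimum(version_text, minimum=MIN_ORACLE_UPDATE):
    """True if the Java 8 update number is at least the given minimum."""
    return parse_java8_update(version_text) >= minimum
```

With the versions quoted in this thread, both `1.8.0_66-b17` and `1.8.0_144` pass this check, so the upgrade is a precaution rather than a fix for an unsupported setup.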

What is your vendor?

We are using the Oracle JVM. In most cases, ES works very well.
As you suggested, we plan to upgrade to the latest JDK 1.8.

Sorry, it does not. The description is lacking sufficient detail (logs, error messages, etc.) for us to triage this one.
