The failed drive is not accessible anymore, so those Elasticsearch logs are gone.
On an identical VM I checked the mount config:
/dev/mapper/main-root on / type ext4 (rw,relatime,errors=remount-ro,data=ordered)
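For reference, the error behavior persisted in the ext4 superblock can also be checked with tune2fs (using the device from the mount output above; the output line shown is just an example, yours may differ):

sudo tune2fs -l /dev/mapper/main-root | grep -i 'errors behavior'
# e.g. "Errors behavior:          Continue" -- note the errors= mount option overrides this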
On other nodes in the cluster I see these entries from when I killed the "bad" VM, es-hay0-18 (as expected):
[2019-04-18T09:24:58,108][INFO ][o.e.c.s.ClusterApplierService] [es-hay0-19] removed {{es-hay0-18}{ym_mc6WZTqaZrDTIweCgjA}{nE59gZFzS8mrsL3ga6qQqQ}{10.0.0.127}{10.0.0.127:9300}{rack_id=br1515, ml.machine_memory=42193956864, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true, gateway=true},}, reason: apply cluster state (from master [master {es-hay0-04}{Gmw2m6AyQ8WN05zXunQfng}{H3lc7EYAQKe0Eb105MaWIw}{10.0.0.113}{10.0.0.113:9300}{rack_id=br1517, ml.machine_memory=42193956864, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true, gateway=true} committed version [35669]])
A little earlier there are three of these:
[2019-04-18T08:38:41,220][WARN ][o.e.c.r.a.AllocationService] [es-hay0-04] failing shard [failed shard, shard [dc-telegraf-logs-2019.04.18][3], node[ym_mc6WZTqaZrDTIweCgjA], [R], s[STARTED], a[id=l6EPHvWoQeWhj41WXxDwoQ], message [failed to perform indices:data/write/bulk[s] on replica [dc-telegraf-logs-2019.04.18][3], node[ym_mc6WZTqaZrDTIweCgjA], [R], s[STARTED], a[id=l6EPHvWoQeWhj41WXxDwoQ]], failure [RemoteTransportException[[es-hay0-18][10.0.0.127:9300][indices:data/write/bulk[s][r]]]; nested: AlreadyClosedException[[dc-telegraf-logs-2019.04.18][3] engine is closed]; nested: FileSystemException[/var/data/es-00/nodes/0/indices/ZjHR1lKVQAW-17SCPVffZA/3/index/_h89.fdx: Read-only file system]; ], markAsStale [true]]
org.elasticsearch.transport.RemoteTransportException: [es-hay0-18][10.0.0.127:9300][indices:data/write/bulk[s][r]]
Caused by: org.apache.lucene.store.AlreadyClosedException: [dc-telegraf-logs-2019.04.18][3] engine is closed
at org.elasticsearch.index.engine.Engine.ensureOpen(Engine.java:760) ~[elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.index.engine.Engine.ensureOpen(Engine.java:769) ~[elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:871) ~[elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:788) ~[elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:755) ~[elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.index.shard.IndexShard.applyIndexOperationOnReplica(IndexShard.java:725) ~[elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(TransportShardBulkAction.java:425) ~[elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(TransportShardBulkAction.java:393) ~[elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnReplica(TransportShardBulkAction.java:380) ~[elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnReplica(TransportShardBulkAction.java:79) ~[elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncReplicaAction.onResponse(TransportReplicationAction.java:637) ~[elasticsearch-6.6.2.jar:6.6.2]
The stack trace continues... I can add more if needed.
Just to be clear: is mounting with errors=panic a good way to stop ES if there are filesystem issues?
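If so, I'd be looking at something like this (a sketch only, based on the mount output above; my real fstab may reference the device by UUID, and I haven't tested this yet):

# /etc/fstab -- change errors=remount-ro to errors=panic so the kernel
# panics on the first ext4 error instead of remounting read-only
/dev/mapper/main-root  /  ext4  rw,relatime,errors=panic,data=ordered  0  1

or alternatively persisting it in the superblock so it applies regardless of mount options:

sudo tune2fs -e panic /dev/mapper/main-root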