Hi,
I've set up Kubernetes on bare metal and am trying to run a three-node cluster there. In my dev environment I am currently using GlusterFS for storage, which also runs in Kubernetes.
I am getting these kinds of errors:
"type":"server",
"timestamp":"2019-07-11T13:03:12,139+0000",
"level":"WARN",
"component":"o.e.c.r.a.AllocationService",
"cluster.name":"poc",
"node.name":"poc-es-master-1",
"cluster.uuid":"dXOiKR5_Qsu1ZSSsQ8-8qw",
"node.id":"rQnFrxrFSsiXgGmshVtGGg",
"message":"failing shard [failed shard, shard [plx_session-2019.w28][0], node[rQnFrxrFSsiXgGmshVtGGg], [R], s[STARTED], a[id=GbmKjl78SnS7IUiM-G23-Q], message [failed to perform indices:data/write/bulk[s] on replica [plx_session-2019.w28][0], node[rQnFrxrFSsiXgGmshVtGGg], [R], s[STARTED], a[id=GbmKjl78SnS7IUiM-G23-Q]], failure [RemoteTransportException[[poc-es-master-1][192.168.198.184:9300][indices:data/write/bulk[s][r]]]; nested: AlreadyClosedException[translog is already closed]; ], markAsStale [true]]"
"stacktrace": ["org.elasticsearch.transport.RemoteTransportException: [poc-es-master-1][192.168.198.184:9300][indices:data/write/bulk[s][r]]",
"Caused by: org.apache.lucene.store.AlreadyClosedException: translog is already closed",
"at org.elasticsearch.index.translog.Translog.ensureOpen(Translog.java:1778) ~[elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.index.translog.Translog.add(Translog.java:535) ~[elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:872) ~[elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:789) ~[elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:762) ~[elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.index.shard.IndexShard.applyIndexOperationOnReplica(IndexShard.java:726) ~[elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(TransportShardBulkAction.java:416) ~[elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(TransportShardBulkAction.java:386) ~[elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnReplica(TransportShardBulkAction.java:373) ~[elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnReplica(TransportShardBulkAction.java:79) ~[elasticsearch-7.1.1.jar:7.1.1]",
...
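To see whether the same root cause repeats across the whole logfile, the "Caused by" entries can be counted with a short script. This is only a rough sketch: it assumes the logfile holds one JSON object per line with a "stacktrace" array, as in the excerpt above, and the function name root_causes is made up for illustration.

```python
import json
from collections import Counter


def root_causes(lines):
    """Count 'Caused by: ...' entries in the stacktrace of each JSON log line."""
    counts = Counter()
    prefix = "Caused by: "
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a complete JSON object
        if not isinstance(entry, dict):
            continue
        for frame in entry.get("stacktrace", []):
            # Stack frames are plain strings; nested causes start with the prefix.
            if isinstance(frame, str) and frame.startswith(prefix):
                counts[frame[len(prefix):]] += 1
    return counts
```

Calling root_causes(open(path)) on the logfile (path being wherever the file is stored) and printing the result of .most_common() would show whether "translog is already closed" is the only cause or just one of several.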
I've placed the logfile and the Kubernetes StatefulSet configuration I am using at paste.ee.
Is this issue triggered by the underlying storage provider, GlusterFS, or is something misconfigured in my ES cluster that has nothing to do with the underlying storage?
Thanks,
Andreas