Continuing from the previous issue on GitHub: I found that this problem mostly occurs on one index. That index often reports "failed to obtain in-memory shard lock", leaving shards unassigned. Both primary and replica shards are affected, which turns the cluster red. Could it be related to a failure to serialize this index's data? Slow queries are also frequent: there are several queries of around 200 ms within a single second. Nodes keep getting disconnected, after which the replica shards are reassigned. Is this related?
Which version of Elasticsearch are you using?
What is the size and specification of your cluster with respect to node count, CPU, RAM, heap and type of storage used?
How much data in terms of volume and shards does the cluster hold? What kind of load is the cluster under?
Are there any errors in the Elasticsearch logs? If so, please share the full error messages as well as some context around them.
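For reference, the output of the following APIs would answer most of the above (Kibana Dev Tools syntax; the 6.x `_cat` column names shown here may need adjusting for your version):

```
GET _cat/nodes?v&h=name,version,heap.percent,ram.percent,cpu,disk.used_percent
GET _cat/health?v
GET _cat/indices?v&h=index,pri,rep,docs.count,store.size
```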
Version 6.2.2.
I run into this problem often. The version is 6.2.2. After obtaining the shard lock times out, retrying with `curl -XPOST 'localhost:9200/_cluster/reroute?retry_failed=true'` has no effect; I have to restart the disconnected node before running that command in order to recover. The same situation recurs every few hours, which is very frustrating. The load and CPU are not very high, and CPU never reaches 100%.
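For reference, the retry and a quick way to check why shards stay unassigned, in Dev Tools syntax (`unassigned.reason` is a standard 6.x `_cat/shards` column):

```
POST _cluster/reroute?retry_failed=true

GET _cat/shards?v&h=index,shard,prirep,state,unassigned.reason,node
```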
[2024-08-09T11:52:10,330][DEBUG][o.e.a.a.c.n.s.TransportNodesStatsAction] [node-name-1] failed to execute on node [2Tkf6jv_RHeK4UlO-PBVLQ]
org.elasticsearch.transport.RemoteTransportException: [Failed to deserialize response from handler [org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler]]
Caused by: org.elasticsearch.transport.TransportSerializationException: Failed to deserialize response from handler [org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler]
at org.elasticsearch.transport.TcpTransport.handleResponse(TcpTransport.java:1441) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1400) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:64) [transport-netty4-6.2.2.jar:6.2.2]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310) [netty-codec-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:297) [netty-codec-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:413) [netty-codec-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) [netty-codec-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) [netty-handler-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.16.Final.jar:4.1.16.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_221]
Caused by: java.io.IOException: Invalid string; unexpected character: 241 hex: f1
at org.elasticsearch.common.io.stream.StreamInput.readString(StreamInput.java:375) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.io.stream.StreamInput.readMap(StreamInput.java:459) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.node.AdaptiveSelectionStats.<init>(AdaptiveSelectionStats.java:56) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.io.stream.StreamInput.readOptionalWriteable(StreamInput.java:733) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.admin.cluster.node.stats.NodeStats.readFrom(NodeStats.java:239) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TransportResponseHandler.read(TransportResponseHandler.java:47) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.read(TransportService.java:1085) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TcpTransport.handleResponse(TcpTransport.java:1437) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1400) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:64) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:297) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:413) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935) ~[?:?]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) ~[?:?]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) ~[?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) ~[?:?]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_221]
[2024-08-09T11:52:10,338][WARN ][o.e.t.n.Netty4Transport ] [node-name-1] exception caught on transport layer [NettyTcpChannel{localAddress=/172.30.240.157:17868, remoteAddress=172.30.240.159/172.30.240.159:29300}], closing connection
java.lang.IllegalStateException: Message not fully read (response) for requestId [15232515], handler [org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler/org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction$1@2e2334a6], error [false]; resetting
at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1407) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:64) ~[transport-netty4-6.2.2.jar:6.2.2]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310) [netty-codec-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:297) [netty-codec-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:413) [netty-codec-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) [netty-codec-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) [netty-handler-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.16.Final.jar:4.1.16.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_221]
[2024-08-09T11:52:10,410][WARN ][o.e.a.b.TransportShardBulkAction] [node-name-1] [[d_game_virtual_machine_info.t_game_virtual_machine_info][3]] failed to perform indices:data/write/bulk[s] on replica [d_game_virtual_machine_info.t_game_virtual_machine_info][3], node[2Tkf6jv_RHeK4UlO-PBVLQ], [R], s[STARTED], a[id=1GFbRL_xQ5u2c3kcUemCbQ]
org.elasticsearch.transport.NodeNotConnectedException: [node-name-3][172.30.240.159:29300] Node not connected
at org.elasticsearch.transport.TcpTransport.getConnection(TcpTransport.java:693) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TcpTransport.getConnection(TcpTransport.java:122) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TransportService.getConnection(TransportService.java:529) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:505) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction.sendReplicaRequest(TransportReplicationAction.java:1189) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReplicasProxy.performOn(TransportReplicationAction.java:1153) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.ReplicationOperation.performOnReplica(ReplicationOperation.java:170) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.ReplicationOperation.performOnReplicas(ReplicationOperation.java:154) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:121) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:359) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:299) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:975) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:972) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:238) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationPermit(IndexShard.java:2220) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction.acquirePrimaryShardReference(TransportReplicationAction.java:984) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction.access$500(TransportReplicationAction.java:98) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.doRun(TransportReplicationAction.java:320) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:295) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:282) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:656) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:672) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_221]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_221]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_221]
[2024-08-09T11:52:10,442][WARN ][o.e.a.b.TransportShardBulkAction] [node-name-1] [[d_area_user_storage_index_info.t_area_user_game_data_schedule_active_info][2]] failed to perform indices:data/write/bulk[s] on replica [d_area_user_storage_index_info.t_area_user_game_data_schedule_active_info][2], node[2Tkf6jv_RHeK4UlO-PBVLQ], [R], s[STARTED], a[id=5lR4gCsEQ52Fu1JIlTLdLg]
org.elasticsearch.transport.NodeNotConnectedException: [node-name-3][172.30.240.159:29300] Node not connected
at org.elasticsearch.transport.TcpTransport.getConnection(TcpTransport.java:693) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TcpTransport.getConnection(TcpTransport.java:122) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TransportService.getConnection(TransportService.java:529) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:505) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction.sendReplicaRequest(TransportReplicationAction.java:1189) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReplicasProxy.performOn(TransportReplicationAction.java:1153) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.ReplicationOperation.performOnReplica(ReplicationOperation.java:170) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.ReplicationOperation.performOnReplicas(ReplicationOperation.java:154) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:121) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:359) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:299) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:975) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:972) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:238) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationPermit(IndexShard.java:2220) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction.acquirePrimaryShardReference(TransportReplicationAction.java:984) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction.access$500(TransportReplicationAction.java:98) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.doRun(TransportReplicationAction.java:320) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:295) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:282) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:656) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:672) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_221]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_221]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_221]
[2024-08-09T11:52:10,443][WARN ][o.e.a.b.TransportShardBulkAction] [node-name-1] [[d_area_user_storage_index_info.t_area_user_storage_index_info][3]] failed to perform indices:data/write/bulk[s] on replica [d_area_user_storage_index_info.t_area_user_storage_index_info][3], node[2Tkf6jv_RHeK4UlO-PBVLQ], [R], s[STARTED], a[id=4GhR02DJTaKz9i06Cvd_Fg]
org.elasticsearch.transport.NodeNotConnectedException: [node-name-3][172.30.240.159:29300] Node not connected
at org.elasticsearch.transport.TcpTransport.getConnection(TcpTransport.java:693) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TcpTransport.getConnection(TcpTransport.java:122) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TransportService.getConnection(TransportService.java:529) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:505) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction.sendReplicaRequest(TransportReplicationAction.java:1189) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:299) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:975) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:972) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:238) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationPermit(IndexShard.java:2220) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction.acquirePrimaryShardReference(TransportReplicationAction.java:984) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction.access$500(TransportReplicationAction.java:98) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.doRun(TransportReplicationAction.java:320) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:295) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:282) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:656) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:672) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_221]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_221]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_221]
[2024-08-09T11:53:00,254][WARN ][o.e.a.b.TransportShardBulkAction] [node-name-1] [[d_area_user_storage_index_info.t_path_mapping_info][1]] failed to perform indices:data/write/bulk[s] on replica [d_area_user_storage_index_info.t_path_mapping_info][1], node[2Tkf6jv_RHeK4UlO-PBVLQ], [R], s[STARTED], a[id=lGZQkGQZSDWMPyYwD-4gvQ]
org.elasticsearch.transport.RemoteTransportException: [node-name-3][172.30.240.159:29300][indices:data/write/bulk[s][r]]
Caused by: java.lang.IllegalStateException: active primary shard [d_area_user_storage_index_info.t_path_mapping_info][1], node[2Tkf6jv_RHeK4UlO-PBVLQ], [P], s[STARTED], a[id=lGZQkGQZSDWMPyYwD-4gvQ] cannot be a replication target before relocation hand off, state is [STARTED]
at org.elasticsearch.index.shard.IndexShard.verifyReplicationTarget(IndexShard.java:1508) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.index.shard.IndexShard.acquireReplicaOperationPermit(IndexShard.java:2240) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncReplicaAction.doRun(TransportReplicationAction.java:641) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReplicaOperationTransportHandler.messageReceived(TransportReplicationAction.java:513) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReplicaOperationTransportHandler.messageReceived(TransportReplicationAction.java:493) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1555) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:672) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_221]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_221]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_221]
[2024-08-09T11:53:00,338][WARN ][o.e.t.n.Netty4Transport ] [node-name-1] send message failed [channel: NettyTcpChannel{localAddress=0.0.0.0/0.0.0.0:29300, remoteAddress=/172.30.240.77:37909}]
java.nio.channels.ClosedChannelException: null
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source) ~[?:?]
[2024-08-09T11:53:00,423][WARN ][o.e.t.n.Netty4Transport ] [node-name-1] send message failed [channel: NettyTcpChannel{localAddress=0.0.0.0/0.0.0.0:29300, remoteAddress=/172.30.240.77:37934}]
java.nio.channels.ClosedChannelException: null
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source) ~[?:?]
[2024-08-09T11:53:00,825][WARN ][o.e.t.n.Netty4Transport ] [node-name-1] send message failed [channel: NettyTcpChannel{localAddress=0.0.0.0/0.0.0.0:29300, remoteAddress=/172.30.240.159:61112}]
java.nio.channels.ClosedChannelException: null
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source) ~[?:?]
[2024-08-09T11:53:00,838][WARN ][o.e.t.n.Netty4Transport ] [node-name-1] send message failed [channel: NettyTcpChannel{localAddress=0.0.0.0/0.0.0.0:29300, remoteAddress=/172.30.240.159:61084}]
java.nio.channels.ClosedChannelException: null
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source) ~[?:?]
[2024-08-09T11:53:00,884][WARN ][o.e.t.n.Netty4Transport ] [node-name-1] send message failed [channel: NettyTcpChannel{localAddress=/172.30.240.157:29300, remoteAddress=/172.30.240.158:47740}]
java.nio.channels.ClosedChannelException: null
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source) ~[?:?]
[2024-08-09T11:53:00,900][WARN ][o.e.c.a.s.ShardStateAction] [node-name-1] [d_area_user_storage_index_info.t_path_mapping_info][1] unexpected failure while sending request [internal:cluster/shard/failure] to [{node-name-6}{BYAJrw41RROS-dgrqiYfkw}{qfvfqkmPQx26lXHCKXkBKg}{172.30.240.79}{172.30.240.79:29300}{rack=node-rack-6}] for shard entry [shard id [[d_area_user_storage_index_info.t_path_mapping_info][1]], allocation id [lGZQkGQZSDWMPyYwD-4gvQ], primary term [40], message [failed to perform indices:data/write/bulk[s] on replica [d_area_user_storage_index_info.t_path_mapping_info][1], node[2Tkf6jv_RHeK4UlO-PBVLQ], [R], s[STARTED], a[id=lGZQkGQZSDWMPyYwD-4gvQ]], failure [RemoteTransportException[[node-name-3][172.30.240.159:29300][indices:data/write/bulk[s][r]]]; nested: IllegalStateException[active primary shard [d_area_user_storage_index_info.t_path_mapping_info][1], node[2Tkf6jv_RHeK4UlO-PBVLQ], [P], s[STARTED], a[id=lGZQkGQZSDWMPyYwD-4gvQ] cannot be a replication target before relocation hand off, state is [STARTED]]; ]]
org.elasticsearch.transport.RemoteTransportException: [node-name-6][172.30.240.79:29300][internal:cluster/shard/failure]
Caused by: org.elasticsearch.cluster.action.shard.ShardStateAction$NoLongerPrimaryShardException: primary term [40] did not match current primary term [41]
at org.elasticsearch.cluster.action.shard.ShardStateAction$ShardFailedClusterStateTaskExecutor.execute(ShardStateAction.java:291) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:643) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:273) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:198) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:133) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207) ~[elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_221]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_221]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_221]
[2024-08-09T11:53:00,916][WARN ][o.e.i.e.Engine ] [node-name-1] [d_area_user_storage_index_info.t_path_mapping_info][1] failed engine [primary shard [[d_area_user_storage_index_info.t_path_mapping_info][1], node[MHZ2QDRvRi2oHnuURklZVA], [P], s[STARTED], a[id=IjMBhX4dQgaQIm_C6sd06g]] was demoted while failing replica shard]
org.elasticsearch.cluster.action.shard.ShardStateAction$NoLongerPrimaryShardException: primary term [40] did not match current primary term [41]
at org.elasticsearch.cluster.action.shard.ShardStateAction$ShardFailedClusterStateTaskExecutor.execute(ShardStateAction.java:291) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:643) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:273) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:198) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:133) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207) ~[elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_221]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_221]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_221]
[2024-08-09T11:53:00,924][WARN ][o.e.i.c.IndicesClusterStateService] [node-name-1] [[d_area_user_storage_index_info.t_path_mapping_info][1]] marking and sending shard failed due to [shard failure, reason [primary shard [[d_area_user_storage_index_info.t_path_mapping_info][1], node[MHZ2QDRvRi2oHnuURklZVA], [P], s[STARTED], a[id=IjMBhX4dQgaQIm_C6sd06g]] was demoted while failing replica shard]]
org.elasticsearch.cluster.action.shard.ShardStateAction$NoLongerPrimaryShardException: primary term [40] did not match current primary term [41]
at org.elasticsearch.cluster.action.shard.ShardStateAction$ShardFailedClusterStateTaskExecutor.execute(ShardStateAction.java:291) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:643) ~[elasticsearch-6.2.2.jar:6.2.2]
[2024-08-09T11:53:01,582][INFO ][o.e.d.z.ZenDiscovery ] [node-name-1] master_left [{node-name-6}{BYAJrw41RROS-dgrqiYfkw}{qfvfqkmPQx26lXHCKXkBKg}{172.30.240.79}{172.30.240.79:29300}{rack=node-rack-6}], reason [failed to ping, tried [3] times, each with maximum [30s] timeout]
[2024-08-09T11:53:01,583][WARN ][o.e.d.z.ZenDiscovery ] [node-name-1] master left (reason = failed to ping, tried [3] times, each with maximum [30s] timeout), current nodes: nodes:
{node-name-1}{MHZ2QDRvRi2oHnuURklZVA}{2LvDCi43TXq2lAVZv3GtWw}{172.30.240.157}{172.30.240.157:29300}{rack=node-rack-1}, local
{node-name-6}{BYAJrw41RROS-dgrqiYfkw}{qfvfqkmPQx26lXHCKXkBKg}{172.30.240.79}{172.30.240.79:29300}{rack=node-rack-6}, master
{node-name-2}{xpeI6kaPROa6n0fNw31xHQ}{8xrzrb9nQBKAv3L9IXWvag}{172.30.240.158}{172.30.240.158:29300}{rack=node-rack-2}
{node-name-4}{ZSq18yOnT6i-AGM312NJtQ}{ium7p8ZiS0WRh_QHFVHHkg}{172.30.240.77}{172.30.240.77:29300}{rack=node-rack-4}
{node-name-3}{2Tkf6jv_RHeK4UlO-PBVLQ}{TH8J2sEkQ8WSXBq0Q-qlzA}{172.30.240.159}{172.30.240.159:29300}{rack=node-rack-3}
{node-name-5}{j79TJpjhSI-PvT_5_lHFKg}{j2SDffAIQo6ezEacrSjmIg}{172.30.240.78}{172.30.240.78:29300}{rack=node-rack-5}
[2024-08-09T11:53:04,630][INFO ][o.e.c.s.ClusterApplierService] [node-name-1] detected_master {node-name-6}{BYAJrw41RROS-dgrqiYfkw}{qfvfqkmPQx26lXHCKXkBKg}{172.30.240.79}{172.30.240.79:29300}{rack=node-rack-6}, reason: apply cluster state (from master [master {node-name-6}{BYAJrw41RROS-dgrqiYfkw}{qfvfqkmPQx26lXHCKXkBKg}{172.30.240.79}{172.30.240.79:29300}{rack=node-rack-6} committed version [3074]])
[2024-08-09T11:53:05,996][DEBUG][o.e.a.a.c.h.TransportClusterHealthAction] [node-name-1] no known master node, scheduling a retry
[2024-08-09T11:53:06,877][DEBUG][o.e.a.a.c.h.TransportClusterHealthAction] [node-name-1] no known master node, scheduling a retry
[2024-08-09T11:53:12,684][WARN ][o.e.i.c.IndicesClusterStateService] [node-name-1] [[d_area_user_storage_index_info.t_area_user_game_data_schedule_active_info][2]] marking and sending shard failed due to [failed to create shard]
java.io.IOException: failed to obtain in-memory shard lock
at org.elasticsearch.index.IndexService.createShard(IndexService.java:392) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:514) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:143) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.indices.cluster.IndicesClusterStateService.createShard(IndicesClusterStateService.java:552) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:529) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:231) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$6(ClusterApplierService.java:498) ~[elasticsearch-6.2.2.jar:6.2.2]
at java.lang.Iterable.forEach(Iterable.java:75) [?:1.8.0_221]
at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:495) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:482) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:161) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573) [elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244) [elasticsearch-6.2.2.jar:6.2.2]
--------
CPU load, RAM, and heap are all not high.
![image|690x333](upload://fcQRVmx6OBqRlYBiJYhUHfrttvz.jpeg)
![image|690x166](upload://ohsE20fP8IZGriQ8YXqJEbQ4ELd.jpeg)
The cluster has six nodes: three master nodes and three data nodes. Garbage collection (GC) has also been tuned; after resizing the JVM young generation, GC is no longer frequent.
This version was released 6½ years ago, and I'm pretty sure you're hitting a bug that has long since been fixed. I know of no workaround in this version; you need to upgrade to a supported version ASAP.
Thanks. Which compatible version is recommended for the upgrade? Versions 7.x and 6.x are incompatible, right?
IMHO the best thing to do is to reindex into a new cluster.
Otherwise you need to upgrade to 6.8, then run the upgrade assistant, fix what is needed to be fixed, upgrade to 7.17.
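As a rough sketch of that path: once on 6.8, the deprecation info API (which backs the upgrade assistant) reports what must be fixed before moving to 7.x. The index name below is from your logs; the endpoint path shown is the 6.8 form (it moved to `_migration/deprecations` in 7.x), so double-check it against the docs for your exact version:

```
# Cluster-wide deprecation check (6.8 endpoint):
GET /_xpack/migration/deprecations

# Per-index check for one of the affected indices:
GET /d_area_user_storage_index_info.t_path_mapping_info/_xpack/migration/deprecations
```

Anything reported as "critical" has to be resolved (often by reindexing that index) before the 7.17 upgrade will succeed.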
OK. I want to confirm whether this bug has been fixed in version 6.8; if so, I will upgrade to 6.8.
As far as I can tell it is not clear exactly what the issue is, and since this is such an old version it is unlikely anyone will spend a lot of time troubleshooting it to find the oldest version in which it is fixed. I would therefore recommend you set up a new cluster on the newest version possible and reindex remotely from the original cluster.
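For reference, reindexing remotely is a two-step setup: the new cluster must whitelist the old cluster's HTTP address in `elasticsearch.yml`, and then a `_reindex` request with a `remote` source pulls the data across. The hostname below is a placeholder, and note it is the HTTP port (9200 by default), not the transport port from your logs:

```
# elasticsearch.yml on the NEW cluster (placeholder host):
reindex.remote.whitelist: "old-es-host:9200"
```

```
POST _reindex
{
  "source": {
    "remote": { "host": "http://old-es-host:9200" },
    "index": "d_area_user_storage_index_info.t_path_mapping_info"
  },
  "dest": {
    "index": "d_area_user_storage_index_info.t_path_mapping_info"
  }
}
```

Create the destination index with the desired mappings first, and repeat per index (or script it over the index list) until everything is migrated.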
++ to what Christian said. Even 6.8 passed EOL years ago and is irresponsibly old. There have been some bug fixes in this area even in the 8.x series, so we recommend using the latest version (8.15.0 at the time of writing) to make sure you don't hit any known issues.
Thank you very much