Need help with a 3-node cluster configuration in v7.9.2

Hello,

I'm setting up a 3-node Elasticsearch v7.9.2 cluster on separate VMs. When I perform a search, I see IllegalStateException in the logs. I need some help checking what changes are required in the elasticsearch.yml configuration of each node, shown below:

Node 1:

cluster.name: es792
node.name: 10.123.234.1
node.master: true
node.data: true
network.host: 10.123.234.1
http.port: 9200
transport.bind_host: 10.123.234.1
transport.tcp.port: 9300
discovery.seed_hosts: ["10.123.234.1","10.123.234.2","10.123.234.3"]
cluster.initial_master_nodes: [10.123.234.1]
xpack.ml.enabled: false
xpack.security.enabled: false

Node 2:

cluster.name: es792
node.name: 10.123.234.3
node.master: true
node.data: true
network.host: 10.123.234.3
http.port: 9200
transport.bind_host: 10.123.234.3
transport.tcp.port: 9300
discovery.seed_hosts: ["10.123.234.1","10.123.234.2","10.123.234.3"]
xpack.ml.enabled: false
xpack.security.enabled: false

Node 3:

cluster.name: es792
node.name: 10.123.234.2
node.master: true
node.data: true
network.host: 10.123.234.2
http.port: 9200
transport.bind_host: 10.123.234.2
transport.tcp.port: 9300
discovery.seed_hosts: ["10.123.234.1","10.123.234.2","10.123.234.3"]
xpack.ml.enabled: false
xpack.security.enabled: false

I'm observing the following exceptions:

org.elasticsearch.transport.RemoteTransportException: [10.123.234.1][10.123.234.1:9300][internal:cluster/coordination/join]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: incoming term 1 does not match current term 2
        at org.elasticsearch.cluster.coordination.CoordinationState.handleJoin(CoordinationState.java:225) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.cluster.coordination.Coordinator.handleJoin(Coordinator.java:1013) ~[elasticsearch-7.9.2.jar:7.9.2]
        at java.util.Optional.ifPresent(Optional.java:183) ~[?:?]


Exception caught on transport layer [Netty4TcpChannel{localAddress=/10.123.234.2:9300, remoteAddress=/10.123.234.1:61631}], closing connection
java.lang.IllegalStateException: transport not ready yet to handle incoming requests
        at org.elasticsearch.transport.TransportService.onRequestReceived(TransportService.java:943) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundHandler.handleRequest(InboundHandler.java:136) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHa
	


[2020-10-15T01:06:03,576][INFO ][o.e.n.Node               ] [10.123.234.1] started
[2020-10-15T01:07:27,857][WARN ][o.e.t.TcpTransport       ] [10.123.234.1] exception caught on transport layer [Netty4TcpChannel{localAddress=/10.123.234.1:63544, remoteAddress=10.123.234.3/10.123.234.3:9300}], closing connection
java.lang.IllegalStateException: Message not fully read (response) for requestId [225], handler [org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler/org.elasticsearch.transport.TransportService$6@2d61b747], error [false]; resetting
        at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:124) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:78) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:692) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:142) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:117) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:82) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:76) [transport-netty4-client-7.9.2.jar:7.9.2]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271) [netty-handler-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]


[2020-10-15T01:07:27,857][WARN ][o.e.t.TcpTransport       ] [10.123.234.1] exception caught on transport layer [Netty4TcpChannel{localAddress=/10.123.234.1:63548, remoteAddress=10.123.234.2/10.123.234.2:9300}], closing connection
java.lang.IllegalStateException: Message not fully read (response) for requestId [229], handler [org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler/org.elasticsearch.transport.TransportService$6@7646f4d6], error [false]; resetting
        at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:124) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:78) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:692) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:142) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:117) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:82) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:76) [transport-netty4-client-7.9.2.jar:7.9.2]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271) [netty-handler-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
		

[2020-10-15T01:07:27,873][WARN ][o.e.t.TcpTransport       ] [10.123.234.1] exception caught on transport layer [Netty4TcpChannel{localAddress=/10.123.234.1:63546, remoteAddress=10.123.234.2/10.123.234.2:9300}], closing connection
java.lang.IllegalStateException: Message not fully read (response) for requestId [226], handler [org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler/org.elasticsearch.transport.TransportService$6@6d5264b6], error [false]; resetting
        at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:124) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:78) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:692) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:142) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:117) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:82) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:76) [transport-netty4-client-7.9.2.jar:7.9.2]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]

That normally indicates a bug. Would you use tcpdump to capture the network traffic on this node and share it with me? We'd need to see the corresponding logs too to pinpoint the problematic message. I can set up a private upload link if you'd rather not share this in public.

Thanks David.

Unfortunately we can't use tcpdump in our secured environment. I'm checking with IT whether there is any alternative. For now, I can share the debug logs that we captured. Could you please share an upload link?

Here is another exception that we see consistently in our setup:

[2020-10-15T16:07:27,908][WARN ][o.e.t.TcpTransport       ] [10.123.234.2] exception caught on transport layer [Netty4TcpChannel{localAddress=/10.123.234.2:9300, remoteAddress=/10.123.234.1:49996}], closing connection
java.lang.IllegalStateException: transport not ready yet to handle incoming requests
        at org.elasticsearch.transport.TransportService.onRequestReceived(TransportService.java:943) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundHandler.handleRequest(InboundHandler.java:136) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:93) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:78) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:692) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:142) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:117) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:82) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:76) [transport-netty4-client-7.9.2.jar:7.9.2]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271) [netty-handler-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.49.Final.jar:4.1.49.Final]
        at java.lang.Thread.run(Thread.java:832) [?:?]
[2020-10-15T16:07:27,954][WARN ][o.e.t.TcpTransport       ] [10.123.234.2] exception caught on transport layer [Netty4TcpChannel{localAddress=/10.123.234.2:9300, remoteAddress=/10.123.234.1:49998}], closing connection
java.lang.IllegalStateException: transport not ready yet to handle incoming requests
        at org.elasticsearch.transport.TransportService.onRequestReceived(TransportService.java:943) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundHandler.handleRequest(InboundHandler.java:136) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:93) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:78) ~[elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:692) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:142) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:117) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:82) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:76) [transport-netty4-client-7.9.2.jar:7.9.2]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271) [netty-handler-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.49.Final.jar:4.1.49.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.49.Final.jar:4.1.49.Final]
        at java.lang.Thread.run(Thread.java:832) [?:?]

Without the network capture the logs aren't much help I'm afraid. As an alternative, try these trace loggers:

logger.org.elasticsearch.transport.TransportLogger: TRACE
logger.org.elasticsearch.transport.netty4.ESLoggingHandler: TRACE

Unfortunately this will produce rather a lot of logs and doesn't always output the information we need, but it might work. You only need to keep these loggers enabled until you see a "Message not fully read" message.
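If restarting nodes to pick up the logger settings is inconvenient, logger levels can also be changed at runtime through the cluster settings API (PUT /_cluster/settings). Here is a minimal sketch that only builds the request body for that call. The helper name logger_settings_body is hypothetical, and using "transient" rather than "persistent" settings is an assumption, not a recommendation from this thread.

```python
import json

# Logger names taken from the post above.
TRACE_LOGGERS = [
    "org.elasticsearch.transport.TransportLogger",
    "org.elasticsearch.transport.netty4.ESLoggingHandler",
]


def logger_settings_body(level):
    """Build a JSON body for PUT /_cluster/settings.

    Pass "TRACE" to enable the loggers, or None to reset them to their
    default level once the failing message has been captured.
    """
    settings = {f"logger.{name}": level for name in TRACE_LOGGERS}
    # Assumption: "transient" is used so the change does not survive
    # a full cluster restart.
    return json.dumps({"transient": settings}, indent=2)


if __name__ == "__main__":
    print(logger_settings_body("TRACE"))  # body that enables tracing
    print(logger_settings_body(None))     # body that resets the loggers
```

Sending the second body after the problem has been reproduced sets both loggers to null, which returns them to their defaults and stops the extra log volume.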

Do you get "transport not ready yet to handle incoming requests" on an ongoing basis, or is it just when the node is starting up?
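One rough way to check this from a node's log file is to compare the timestamps of these errors with the time the node logged "started". The sketch below assumes the log format shown earlier in this thread, where the exception message sits on the line after the timestamped WARN line, and it assumes the file covers a single run of the node. The function names are hypothetical.

```python
import re
from datetime import datetime

# Matches the leading "[2020-10-15T01:06:03,576]" timestamp of a log line.
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}),(\d{3})\]")
NOT_READY = "transport not ready yet to handle incoming requests"


def parse_timestamp(line):
    """Return the datetime at the start of a log line, or None if absent."""
    match = TIMESTAMP.match(line)
    if match is None:
        return None
    text = match.group(1) + "," + match.group(2)
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S,%f")


def not_ready_after_start(lines):
    """Return timestamps of 'transport not ready' errors logged after 'started'."""
    started_at = None
    last_seen = None
    late = []
    for line in lines:
        stamp = parse_timestamp(line)
        if stamp is not None:
            last_seen = stamp
            # The Node logger prints "[node-name] started" once startup completes.
            if line.rstrip().endswith("] started"):
                started_at = stamp
        elif NOT_READY in line and started_at is not None and last_seen is not None:
            # Attribute the exception to the most recent timestamped line.
            late.append(last_seen)
    return late
```

An empty result suggests the errors only occur during startup; any timestamps returned are occurrences after the node reported that it had started.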

David,

I'd like to share the trace logs for this issue with you. Can you give me a location where I can upload them?

Sure, I've sent you a link in a private message.

We modified our Elasticsearch configuration slightly from what I shared above. In this cluster setup we also deploy a custom plugin to support search in our application, and it has been upgraded from ES v6.8.3 to ES v7.9.2. Now, when we run a search, we see exceptions on the Elasticsearch nodes. There is no response, and the nodes in the cluster go down.

Is there a limit on the size of the files that we can upload via the link you provided?

In the uploaded files, I have included all the config files and logs from the respective nodes.

This looks like a bug, but not in Elasticsearch itself. The broken message has type indices:data/read/join-search/filterquery[s], which isn't a native Elasticsearch message, so it must be coming from a plugin or other modification. The logs indicate:

[2020-10-21T12:59:22,404][INFO ][o.e.p.PluginsService     ] [10.125.247.49] loaded plugin [join-search-plugin]

I suspect removing this plugin will fix things.
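To see every plugin a node loaded at startup, the log can be scanned for PluginsService lines like the one above. This is a minimal sketch that assumes the log format shown in this thread; the function name loaded_plugins is hypothetical.

```python
import re

# Matches the PluginsService startup line,
# e.g. "... loaded plugin [join-search-plugin]".
LOADED_PLUGIN = re.compile(
    r"\[o\.e\.p\.PluginsService\s*\].*loaded plugin \[([^\]]+)\]"
)


def loaded_plugins(lines):
    """Return the distinct plugin names reported as loaded, in first-seen order."""
    seen = []
    for line in lines:
        match = LOADED_PLUGIN.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

Running it over each node's log shows which nodes have the plugin installed, which helps when removing it from one node at a time.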


I don't know if there's a size limit on uploads to the link I shared; if there is one it's at least 32GB since we use this service for heap dumps.
