Exceptions in the logs in Elasticsearch 5.6.9


(shashank) #1

I have a single-node cluster deployment on Elasticsearch 5.6.9.

During and after ES startup, I continuously see a lot of these messages in the logs.

Any help on how to debug this further?

2018-06-26 08:23:39,623 [orker.default]][T#2]] [ ] [ ] [ ] (ansport.netty4.Netty4Transport) WARN - exception caught on transport layer [[id: 0x7e1b44e8, L:0.0.0.0/0.0.0.0:9300]], closing connection
io.netty.handler.codec.DecoderException: java.io.StreamCorruptedException: invalid internal transport message format, got (16,3,3,0)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:459) ~[netty-codec-4.1.13.Final.jar:?]
at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:392) ~[netty-codec-4.1.13.Final.jar:?]
at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:359) ~[netty-codec-4.1.13.Final.jar:?]
at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342) ~[netty-codec-4.1.13.Final.jar:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.handler.logging.LoggingHandler.channelInactive(LoggingHandler.java:167) ~[netty-handler-4.1.13.Final.jar:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1329) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:908) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:744) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) ~[netty-common-4.1.13.Final.jar:?]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403) ~[netty-common-4.1.13.Final.jar:?]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462) ~[netty-transport-4.1.13.Final.jar:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) ~[netty-common-4.1.13.Final.jar:?]
at java.lang.Thread.run(Thread.java:811) [?:2.9 (11-22-2017)]
Caused by: java.io.StreamCorruptedException: invalid internal transport message format, got (16,3,3,0)
at org.elasticsearch.transport.TcpTransport.validateMessageHeader(TcpTransport.java:1293) ~[elasticsearch-5.6.9.jar:?]
at org.elasticsearch.transport.netty4.Netty4SizeHeaderFrameDecoder.decode(Netty4SizeHeaderFrameDecoder.java:36) ~[transport-netty4-client-5.6.9.jar:?]
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489) ~[netty-codec-4.1.13.Final.jar:?]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428) ~[netty-codec-4.1.13.Final.jar:?]
... 24 more


(David Pilato) #2

My guess is that you have an older version running somewhere that tries to connect to this cluster, or you are using an old Java TransportClient.


(shashank) #3

Hi David,

I am quite sure that neither an older version nor the transport client is the issue.

Also, I see the exception below, wherein ES is not able to assign shards for one of the indices. Are there any limits suggested by ES with respect to virtual address space usage in 5.x? I was able to get this working when we were on version 2.4.6 of ES.

2018-06-26 09:07:55,073 [ice#updateTask][T#1]] [ ] [ ] [ ] (cluster.service.ClusterService) TRACE - cluster state updated, source [cluster_update_settings]
cluster uuid: dOs6CYDwSTynjVWroC6xrA
version: 80
state uuid: _x4NcHFhROWKdFgYDUvXag
from_diff: false
meta data version: 76
[help/aJrEXiVsSEqEb1SJ29fniw]: v[11]
0: p_term [5], isa_ids [FTDXEc2PTgeTkrQsd8LSrQ]
[data/f4XPISkZSi-5ZR6EP_E3Xg]: v[92]
0: p_term [5], isa_ids [AiXjk7wRS5y9jL23oveavA]
[rule/JQaBovrPShqvLUjN7Riodw]: v[561]
0: p_term [20], isa_ids [7wsrtbE8TraUi7ojm45KtQ]
[work/T1C0eAMfSq2jutByJaH2uw]: v[11]
0: p_term [5], isa_ids [5n-wXMvQTHWe06vwM-MHhg]
nodes:
{4A55AF6DE60789BD3686EBD870014F78}{2erYQJHXQHe4AUIccdNacw}{JToVWFIEQvqU4JA8IWrrmQ}{10.1.99.151}{10.1.99.151:9300}, local, master
routing_table (version 21):
-- index [[help/aJrEXiVsSEqEb1SJ29fniw]]
----shard_id [help][0]
--------[help][0], node[2erYQJHXQHe4AUIccdNacw], [P], s[STARTED], a[id=FTDXEc2PTgeTkrQsd8LSrQ]

-- index [[data/f4XPISkZSi-5ZR6EP_E3Xg]]
----shard_id [data][0]
--------[data][0], node[2erYQJHXQHe4AUIccdNacw], [P], s[STARTED], a[id=AiXjk7wRS5y9jL23oveavA]

-- index [[work/T1C0eAMfSq2jutByJaH2uw]]
----shard_id [work][0]
--------[work][0], node[2erYQJHXQHe4AUIccdNacw], [P], s[STARTED], a[id=5n-wXMvQTHWe06vwM-MHhg]

-- index [[rule/JQaBovrPShqvLUjN7Riodw]]
----shard_id [rule][0]
--------[rule][0], node[null], [P], recovery_source[existing recovery], s[UNASSIGNED], unassigned_info[[reason=ALLOCATION_FAILED], at[2018-06-26T07:26:53.165Z], failed_attempts[5], delayed=false, details[failed recovery, failure RecoveryFailedException[[rule][0]: Recovery failed on {4A55AF6DE60789BD3686EBD870014F78}{2erYQJHXQHe4AUIccdNacw}{JToVWFIEQvqU4JA8IWrrmQ}{10.1.99.151}{10.1.99.151:9300}]; nested: IndexShardRecoveryException[failed to recover from gateway]; nested: EngineCreationFailureException[failed to open reader on writer]; nested: IOException[EDC5124I Too many open files. (errno2=0x07360344): MMapIndexInput(path="/lbcell3/WASL023/temp/SearchIndex/nodes/0/indices/JQaBovrPShqvLUjN7Riodw/0/index/_1b.cfs") [this may be caused by lack of enough unfragmented virtual address space or too restrictive virtual memory limits enforced by the operating system, preventing us to map a chunk of 68601012 bytes. Please review 'ulimit -v', 'ulimit -m' (both should return 'unlimited'). More information: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html]]; ], allocation_status[deciders_no]]

routing_nodes:
-----node_id[2erYQJHXQHe4AUIccdNacw][V]
--------[help][0], node[2erYQJHXQHe4AUIccdNacw], [P], s[STARTED], a[id=FTDXEc2PTgeTkrQsd8LSrQ]
--------[data][0], node[2erYQJHXQHe4AUIccdNacw], [P], s[STARTED], a[id=AiXjk7wRS5y9jL23oveavA]
--------[work][0], node[2erYQJHXQHe4AUIccdNacw], [P], s[STARTED], a[id=5n-wXMvQTHWe06vwM-MHhg]
---- unassigned
--------[rule][0], node[null], [P], recovery_source[existing recovery], s[UNASSIGNED], unassigned_info[[reason=ALLOCATION_FAILED], at[2018-06-26T07:26:53.165Z], failed_attempts[5], delayed=false, details[failed recovery, failure RecoveryFailedException[[rule][0]: Recovery failed on {4A55AF6DE60789BD3686EBD870014F78}{2erYQJHXQHe4AUIccdNacw}{JToVWFIEQvqU4JA8IWrrmQ}{10.1.99.151}{10.1.99.151:9300}]; nested: IndexShardRecoveryException[failed to recover from gateway]; nested: EngineCreationFailureException[failed to open reader on writer]; nested: IOException[EDC5124I Too many open files. (errno2=0x07360344): MMapIndexInput(path="/lbcell3/WASL023/temp/SearchIndex/nodes/0/indices/JQaBovrPShqvLUjN7Riodw/0/index/_1b.cfs") [this may be caused by lack of enough unfragmented virtual address space or too restrictive virtual memory limits enforced by the operating system, preventing us to map a chunk of 68601012 bytes. Please review 'ulimit -v', 'ulimit -m' (both should return 'unlimited'). More information: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html]]; ], allocation_status[deciders_no]]
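The root cause in the log above is an OS-level one: "Too many open files" while mmap-ing a segment, and the Lucene message itself suggests reviewing the process limits. As a first check, the snippet below (a minimal sketch, assuming a POSIX shell run as the user that starts Elasticsearch) prints the limits that the error message refers to:

```shell
# Print the per-process limits the Lucene/mmap error message asks about.
# Run this in the shell environment of the user that launches Elasticsearch,
# since limits are per-user/per-process, not system-wide.
ulimit -n   # max open file descriptors; the 5.x docs recommend at least 65536
ulimit -v   # virtual address space; should print "unlimited" for MMapDirectory
ulimit -m   # max resident memory; should also print "unlimited"
```

If `-v` or `-m` is capped, or `-n` is low (e.g. the common default of 1024), raising them in `/etc/security/limits.conf` or the service unit before restarting the node is the usual fix.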


(David Pilato) #4

This:

2018-06-26 08:23:39,623 [orker.default]][T#2]] [ ] [ ] [ ] (ansport.netty4.Netty4Transport) WARN - exception caught on transport layer [[id: 0x7e1b44e8, L:0.0.0.0/0.0.0.0:9300]], closing connection
io.netty.handler.codec.DecoderException: java.io.StreamCorruptedException: invalid internal transport message format, got (16,3,3,0)

is totally unrelated, IMO, to:

{10.1.99.151}{10.1.99.151:9300}]; nested: IndexShardRecoveryException[failed to recover from gateway]; nested: EngineCreationFailureException[failed to open reader on writer]; nested: IOException[EDC5124I Too many open files. (errno2=0x07360344): MMapIndexInput(path="/lbcell3/WASL023/temp/SearchIndex/nodes/0/indices/JQaBovrPShqvLUjN7Riodw/0/index/_1b.cfs") [this may be caused by lack of enough unfragmented virtual address space or too restrictive virtual memory limits enforced by the operating system, preventing us to map a chunk of 68601012 bytes. Please review 'ulimit -v', 'ulimit -m' (both should return 'unlimited'). More information: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html]]; ], allocation_status[deciders_no]]

Here you most likely have a node with too many shards on it. That's just a guess.
What is the output of:

GET _cat/health?v
GET _cat/indices?v
GET _cat/nodes?v

(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.