Hello,
I'd like to replace one of our cluster nodes because the machine can't handle
the load properly. The cluster currently runs ES 0.19.11 on two Intel
machines. We have a T5220 SPARC box that does nothing useful, so I configured
an ES master node on it. The indices come from logstash-1.1.6-dev and
graylog2 0.10.0rc1, both with default settings (5 shards, 1 replica, no
templates or the like).
I can start the ES node on the SPARC machine fine; it joins the cluster and
sits there doing nothing. But as soon as I shut down the to-be-replaced node
and recovery starts, I get an exception and the JVM crashes. The JVM version
is: Java HotSpot(TM) 64-Bit Server VM (build 23.5-b02, mixed mode) - SPARC
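For completeness, this is roughly how I check that the SPARC node has really
joined before taking the old node down; a minimal sketch against the 0.19
Java TransportClient (the cluster name, expected node count, and class name
are illustrative placeholders, not our actual values):

import org.elasticsearch.action.admin.cluster.health.ClusterHealthResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public class CheckJoin {
    public static void main(String[] args) {
        // "mycluster" is a placeholder for our real cluster.name.
        TransportClient client = new TransportClient(ImmutableSettings.settingsBuilder()
                .put("cluster.name", "mycluster")
                .build());
        // Connect straight to the new SPARC node (10.215.9.31 in the logs below).
        client.addTransportAddress(new InetSocketTransportAddress("10.215.9.31", 9300));

        // Block until the cluster reports all three nodes (two Intel + the SPARC box).
        ClusterHealthResponse health = client.admin().cluster().prepareHealth()
                .setWaitForNodes("3")
                .execute().actionGet();
        System.out.println("status=" + health.status() + ", nodes=" + health.numberOfNodes());

        client.close();
    }
}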
The cluster node's logfile shows:
[2012-12-19 10:49:08,300][INFO ][cluster.service ] [es-pheucd01] removed {[es-phbuild02][hD05TiWaQISW6EjpjV_vfA][inet[/10.215.9.9:9300]],}, reason: zen-disco-receive(from master [[es-phewu01][bf8yaQ3CS6GQ4a4TDxQ8Uw][inet[/10.215.9.10:9300]]])
[2012-12-19 10:50:11,610][WARN ][transport.netty ] [es-pheucd01] Message not fully read (response) for [10716] handler future(org.elasticsearch.indices.recovery.RecoveryTarget$3@722c174f), error [true], resetting
[2012-12-19 10:50:11,644][WARN ][indices.cluster ] [es-pheucd01] [logstash-weblogic-2012.12.13][4] failed to start shard
org.elasticsearch.indices.recovery.RecoveryFailedException: [logstash-weblogic-2012.12.13][4]: Recovery failed from [es-phewu01][bf8yaQ3CS6GQ4a4TDxQ8Uw][inet[/10.215.9.10:9300]] into [es-pheucd01][vrpQaaQ3TEq8zB8YS2Ff0A][inet[/10.215.9.31:9300]]
    at org.elasticsearch.indices.recovery.RecoveryTarget.doRecovery(RecoveryTarget.java:293)
    at org.elasticsearch.indices.recovery.RecoveryTarget.access$100(RecoveryTarget.java:64)
    at org.elasticsearch.indices.recovery.RecoveryTarget$2.run(RecoveryTarget.java:183)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: org.elasticsearch.transport.RemoteTransportException: Failed to deserialize exception response from stream
Caused by: org.elasticsearch.transport.TransportSerializationException: Failed to deserialize exception response from stream
    at org.elasticsearch.transport.netty.MessageChannelHandler.handlerResponseError(MessageChannelHandler.java:171)
    at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:125)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:75)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:565)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:793)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:458)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:439)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:75)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:565)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:84)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:471)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:332)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:35)
    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:102)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.StreamCorruptedException: unexpected end of block data
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1369)
Then the JVM crashes. From the hs_err_pid... crash report:
Stack: [0xfffffff0ac500000,0xfffffff0ac580000], sp=0xfffffff0ac57dda0, free space=503k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J  org.apache.lucene.index.IndexFileNames.segmentFileName(Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;
j  org.apache.lucene.index.TermVectorsTermsWriter.abort()V+62
j  org.apache.lucene.index.TermsHash.abort()V+4
j  org.apache.lucene.index.TermsHash.abort()V+31
j  org.apache.lucene.index.DocInverter.abort()V+4
j  org.apache.lucene.index.DocFieldProcessor.abort()V+24
j  org.apache.lucene.index.DocumentsWriter.abort()V+173
j  org.apache.lucene.index.IndexWriter.rollbackInternal()V+170
j  org.apache.lucene.index.IndexWriter.rollback()V+12
j  org.elasticsearch.index.engine.robin.RobinEngine.innerClose()V+65
j  org.elasticsearch.index.engine.robin.RobinEngine.close()V+15
j  org.elasticsearch.index.service.InternalIndexService.deleteShard(IZZZLjava/lang/String;)V+362
j  org.elasticsearch.index.service.InternalIndexService.removeShard(ILjava/lang/String;)V+6
j  org.elasticsearch.indices.cluster.IndicesClusterStateService.handleRecoveryFailure(Lorg/elasticsearch/index/service/IndexService;Lorg/elasticsearch/cluster/routing/ShardRouting;ZLjava/lang/Throwable;)V+109
j  org.elasticsearch.indices.cluster.IndicesClusterStateService.access$300(Lorg/elasticsearch/indices/cluster/IndicesClusterStateService;Lorg/elasticsearch/index/service/IndexService;Lorg/elasticsearch/cluster/routing/ShardRouting;ZLjava/lang/Throwable;)V+6
j  org.elasticsearch.indices.cluster.IndicesClusterStateService$PeerRecoveryListener.onRecoveryFailure(Lorg/elasticsearch/indices/recovery/RecoveryFailedException;Z)V+14
j  org.elasticsearch.indices.recovery.RecoveryTarget.doRecovery(Lorg/elasticsearch/index/shard/service/InternalIndexShard;Lorg/elasticsearch/indices/recovery/StartRecoveryRequest;ZLorg/elasticsearch/indices/recovery/RecoveryTarget$RecoveryListener;)V+947
j  org.elasticsearch.indices.recovery.RecoveryTarget.access$100(Lorg/elasticsearch/indices/recovery/RecoveryTarget;Lorg/elasticsearch/index/shard/service/InternalIndexShard;Lorg/elasticsearch/indices/recovery/StartRecoveryRequest;ZLorg/elasticsearch/indices/recovery/RecoveryTarget$RecoveryListener;)V+6
j  org.elasticsearch.indices.recovery.RecoveryTarget$2.run()V+20
J  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
j  java.lang.Thread.run()V+11
v  ~StubRoutines::call_stub
V  [libjvm.so+0x21dcec] void JavaCalls::call_helper(JavaValue*,methodHandle*,JavaCallArguments*,Thread*)+0x37c
V  [libjvm.so+0x74fd04] void JavaCalls::call_virtual(JavaValue*,Handle,KlassHandle,Symbol*,Symbol*,Thread*)+0x1ac
V  [libjvm.so+0x2d351c] void thread_entry(JavaThread*,Thread*)+0x15c
V  [libjvm.so+0xb720c8] void JavaThread::thread_main_inner()+0x88
V  [libjvm.so+0x2cedc4] void JavaThread::run()+0x3a4
V  [libjvm.so+0xa45ccc] java_start+0x364
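In case it's relevant: the "unexpected end of block data" above comes from
plain Java serialization (ObjectInputStream, as the trace shows), complaining
about a stream that is cut short or misframed. A standalone sketch of that
failure mode, nothing ES-specific (the class name is made up):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;

public class TruncatedStreamDemo {
    public static void main(String[] args) throws Exception {
        // Serialize an exception the way an exception response would be written.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(new RuntimeException("boom"));
        oos.close();

        // Drop the tail of the stream to simulate a message that was not fully read.
        byte[] full = bos.toByteArray();
        byte[] truncated = Arrays.copyOf(full, full.length - 8);

        // Depending on where the stream is cut, readObject() fails with
        // EOFException or StreamCorruptedException ("unexpected end of block data").
        ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(truncated));
        ois.readObject();
    }
}

That would match the "Message not fully read (response)" warning that
precedes the exception in our log.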
Can anybody see the cause of this? I'd be glad to provide more info and the
full logfile/crash report if that helps.
Best regards,
Thomas