Error when running multiple nodes on the same Vista machine

talsalmona · April 11, 2010, 5:00pm

Hi,

I'm running two servers on the same Vista machine )-:
I index an object with one server running:
curl -XPUT 'http://localhost:9200/t1/i1/1' -d '{user:"tal"}'

Then, when I start the second server, I get the errors below:

On the master I get this:
[19:50:50,247][WARN ][cluster.action.shard ] [Atom Bob] Received shard failed for [t1][1], Node[LH-JC0IOAYC2EEZ-12779], [B], S[INITIALIZING], reason [Failed to start shard, message [RecoveryFailedException[Index Shard [t1][1]: Recovery failed from [Atom Bob][LH-JC0IOAYC2EEZ-31666][data][inet[/16.59.79.231:9300]] into [Power Man][LH-JC0IOAYC2EEZ-12779][data][inet[16.59.79.231/16.59.79.231:9300]]];
  nested: RemoteTransportException[[Atom Bob][inet[/16.59.79.231:9300]][t1/1/recovery/start]];
  nested: RecoveryEngineException[[t1][1] Phase[2] Execution failed];
  nested: RemoteTransportException[[Atom Bob][inet[/16.59.79.231:9300]][t1/1/recovery/snapshot]];
  nested: IndexShardNotRecoveringException[[t1][1] CurrentState[STARTED] Shard not in recovering state]; ]]

On the slave I get this:
Caused by: org.elasticsearch.index.shard.IndexShardNotRecoveringException: [t1][3] CurrentState[STARTED] Shard not in recovering state
    at org.elasticsearch.index.shard.service.InternalIndexShard.performRecovery(InternalIndexShard.java:410)
    at org.elasticsearch.index.shard.recovery.RecoveryAction$SnapshotTransportRequestHandler.messageReceived(RecoveryAction.java:468)
    at org.elasticsearch.index.shard.recovery.RecoveryAction$SnapshotTransportRequestHandler.messageReceived(RecoveryAction.java:452)
    at org.elasticsearch.transport.netty.MessageChannelHandler$3.run(MessageChannelHandler.java:155)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)

I'm using the default configuration.
It works great on a Linux box.
Any idea what's happening?

Thanks,
Tal

Berkay_Mollamustafao · April 11, 2010, 5:16pm

I'd suspect interference from one of Windows' overeager security features.
I'd stop the firewall etc. and try again with the security features disabled.

Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype

kimchy · April 11, 2010, 11:53pm

Is this reproducible on the Vista machines?

-shay.banon

talsalmona · April 12, 2010, 7:38am

I tried on multiple Vista machines and this always happens.

talsalmona · April 12, 2010, 3:06pm

I added the following to my configuration file and it seems to work now:

transport:
  netty:
    reuse_address: false
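
A possible explanation, offered as an assumption rather than something confirmed in this thread: in the master log above, both nodes report the same transport address, 16.59.79.231:9300. If reuse_address corresponds to the socket option SO_REUSEADDR, then on Windows that option can let a second process bind a port that already has a listener, so the second node might never move on to the next port. The Python sketch below is not Elasticsearch code; the helper name bind_first_free and the port numbers are made up for illustration. It shows the probing pattern with the option left off, where a bind to an occupied port fails and the next port is tried.

```python
import socket


def bind_first_free(host, start_port, max_tries=100, reuse_address=False):
    """Return (listening_socket, port) for the first port that accepts a bind.

    Hypothetical helper for illustration; not taken from Elasticsearch.
    """
    for port in range(start_port, start_port + max_tries):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        if reuse_address:
            # Platform-dependent: on Windows this may allow binding a port
            # that another listener already holds.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind((host, port))
            sock.listen(1)
            return sock, port
        except OSError:
            sock.close()  # port taken (or not bindable); try the next one
    raise RuntimeError("no free port in range")


if __name__ == "__main__":
    # 19300 is an arbitrary start port chosen to stay clear of a real node.
    node_a, port_a = bind_first_free("127.0.0.1", 19300)
    node_b, port_b = bind_first_free("127.0.0.1", port_a)
    print(port_a, port_b)  # the second port is higher when reuse_address is False
    node_a.close()
    node_b.close()
```

Passing reuse_address=True to the second call is the interesting experiment on a Windows machine: if both calls come back with the same port, that would match the duplicate :9300 addresses in the log.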

kimchy · April 12, 2010, 3:10pm

Really? How did it occur to you that this might help? (I would not have
thought of it, personally, but then again, I have come to expect the
unexpected on Windows.) Does this help in a consistently repeatable
manner? If so, I will default it to "false" when running on Windows.

cheers,
shay.banon

talsalmona · April 12, 2010, 8:49pm

I suspected it had something to do with Netty.
Then it was trial and error.

I will be testing it on many computers in the coming days so I'll be
able to tell you if this solution works.

Thanks,
Tal

kimchy · April 13, 2010, 12:06am

I will change the default to false in any case. The funny thing is that I
had a look at other libs using Netty and they defaulted it to true. Will
ping them as well...

cheers,
-shay.banon
