Error when running multiple nodes on the same Vista machine


(talsalmona) #1

Hi,

I'm running two servers on the same Vista machine )-:
I index an object
curl -XPUT 'http://localhost:9200/t1/i1/1' -d '{user:"tal"}'

with one server running.
Then when I start the second server I get the errors below:

On the master I get this:
[19:50:50,247][WARN ][cluster.action.shard ] [Atom Bob] Received shard failed for [t1][1], Node[LH-JC0IOAYC2EEZ-12779], [B], S[INITIALIZING], reason [Failed to start shard, message [RecoveryFailedException[Index Shard [t1][1]: Recovery failed from [Atom Bob][LH-JC0IOAYC2EEZ-31666][data][inet[/16.59.79.231:9300]] into [Power Man][LH-JC0IOAYC2EEZ-12779][data][inet[16.59.79.231/16.59.79.231:9300]]];
nested: RemoteTransportException[[Atom Bob][inet[/16.59.79.231:9300]][t1/1/recovery/start]];
nested: RecoveryEngineException[[t1][1] Phase[2] Execution failed];
nested: RemoteTransportException[[Atom Bob][inet[/16.59.79.231:9300]][t1/1/recovery/snapshot]];
nested: IndexShardNotRecoveringException[[t1][1] CurrentState[STARTED] Shard not in recovering state]; ]]

On the slave I get this:
Caused by: org.elasticsearch.index.shard.IndexShardNotRecoveringException: [t1][3] CurrentState[STARTED] Shard not in recovering state
    at org.elasticsearch.index.shard.service.InternalIndexShard.performRecovery(InternalIndexShard.java:410)
    at org.elasticsearch.index.shard.recovery.RecoveryAction$SnapshotTransportRequestHandler.messageReceived(RecoveryAction.java:468)
    at org.elasticsearch.index.shard.recovery.RecoveryAction$SnapshotTransportRequestHandler.messageReceived(RecoveryAction.java:452)
    at org.elasticsearch.transport.netty.MessageChannelHandler$3.run(MessageChannelHandler.java:155)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)

I'm using the default configuration.
The same setup works fine on a Linux box.
Any idea what's happening?

Thanks,
Tal


(Berkay Mollamustafaoglu-2) #2

I'd suspect interference from one of Windows' overeager security features.
I'd stop the firewall etc. and retry with the security features disabled.

Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype

On Sun, Apr 11, 2010 at 1:00 PM, Tal talsalmona@gmail.com wrote:


(Shay Banon) #3

Is this reproducible on the Vista machines?

-shay.banon

On Sun, Apr 11, 2010 at 8:00 PM, Tal talsalmona@gmail.com wrote:


(talsalmona) #4

I tried this on multiple Vista machines and it happens every time.

On Apr 12, 2:53 am, Shay Banon shay.ba...@elasticsearch.com wrote:

Is this reproducible on the Vista machines?

-shay.banon


(talsalmona) #5

I added the following to my configuration file and it seems to work now:

transport:
  netty:
    reuse_address: false
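
For context, here is a minimal sketch of what this setting controls at the socket level. It is not Elasticsearch or Netty code. It assumes the setting maps to the standard SO_REUSEADDR socket option, which Java exposes as ServerSocket.setReuseAddress. The class and method names are made up for illustration. In the logs above both nodes report port 9300, which would be consistent with two sockets ending up on the same port, though the thread does not confirm that as the root cause.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical illustration class, not part of Elasticsearch or Netty.
public class ReuseAddressSketch {

    // Opens an unbound server socket, sets SO_REUSEADDR, then binds it.
    static ServerSocket open(int port, boolean reuseAddress) throws IOException {
        ServerSocket socket = new ServerSocket();   // unbound, so the option is set before bind
        socket.setReuseAddress(reuseAddress);
        socket.bind(new InetSocketAddress(port));
        return socket;
    }

    // Returns true if a second socket manages to bind to a port that is already in use.
    static boolean secondBindSucceeds(boolean reuseAddress) {
        try (ServerSocket first = open(0, reuseAddress)) {           // port 0: the OS picks a free port
            try (ServerSocket second = open(first.getLocalPort(), reuseAddress)) {
                return true;
            } catch (BindException e) {
                return false;                                        // port was refused, as one would hope
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reuse=false, second bind succeeds: " + secondBindSucceeds(false));
        // The result for reuse=true depends on the operating system and JDK version.
        System.out.println("reuse=true,  second bind succeeds: " + secondBindSucceeds(true));
    }
}
```

With reuse disabled, the second bind should be refused on any platform, so a second node started on the same machine has to move on to another port instead of sharing the first node's port.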

On Apr 12, 10:38 am, Tal talsalm...@gmail.com wrote:

I tried this on multiple Vista machines and it happens every time.


(Shay Banon) #6

Really? How did it occur to you that that might help? (I would not have
thought of it personally, but then again, I have come to expect the
unexpected on Windows.) Does this really help in a statistically repeatable
manner? If so, I will default it to "false" when running on Windows.
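
A minimal sketch of what such an OS-dependent default could look like. This is a hypothetical helper for illustration only, not the actual Elasticsearch change; the class and method names are assumptions. It relies on the standard os.name system property, which starts with "Windows" on Windows JVMs.

```java
import java.util.Locale;

// Hypothetical helper, not Elasticsearch code.
public class ReuseAddressDefault {

    // Picks a default for reuse_address from an OS name: false on Windows, true elsewhere.
    static boolean defaultFor(String osName) {
        boolean windows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        return !windows;
    }

    public static void main(String[] args) {
        String osName = System.getProperty("os.name");
        System.out.println(osName + " -> reuse_address default: " + defaultFor(osName));
    }
}
```

An explicit value in the configuration file would still take precedence over a default chosen this way.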

cheers,
shay.banon

On Mon, Apr 12, 2010 at 6:06 PM, Tal talsalmona@gmail.com wrote:

I added the following to my configuration file and it seems to work now:

transport:
  netty:
    reuse_address: false


(talsalmona) #7

I suspected it had something to do with Netty.
Then it was trial and error.

I will be testing it on many computers in the coming days so I'll be
able to tell you if this solution works.

Thanks,
Tal

On Apr 12, 6:10 pm, Shay Banon shay.ba...@elasticsearch.com wrote:

Really? How did it occur to you that that might help? (I would not have
thought of it personally, but then again, I have come to expect the
unexpected on Windows.) Does this really help in a statistically repeatable
manner? If so, I will default it to "false" when running on Windows.

cheers,
shay.banon


(Shay Banon) #8

I will change the default to false in any case. The funny thing is that I
had a look at other libs using Netty and they default it to true. Will
ping them as well...

cheers,
-shay.banon

On Mon, Apr 12, 2010 at 11:49 PM, Tal talsalmona@gmail.com wrote:

I suspected it had something to do with Netty.
Then it was trial and error.

I will be testing it on many computers in the coming days so I'll be
able to tell you if this solution works.

Thanks,
Tal

