River Failover


(Michel Conrad) #1

Hi,

I am just getting started with elasticsearch and I am running a
test cluster with 5 nodes. Indexing is done by a river.
After one node rebooted, the river was not started automatically
on another node. curl -XGET still shows the river as running
on the dead node. Am I missing something?

Best,
Michel.


(Shay Banon) #2

The river should be restarted on another node. Maybe you can recreate it and explain the conditions under which it happens?

On Monday, March 28, 2011 at 11:54 AM, Michel Conrad wrote:

Hi,

I am just getting started with elasticsearch and I am running a
test cluster with 5 nodes. Indexing is done by a river.
After one node rebooted, the river was not started automatically
on another node. curl -XGET still shows the river as running
on the dead node. Am I missing something?

Best,
Michel.


(Michel Conrad) #3

I am currently trying to recreate the issue. Could it be related to
my using the default configuration, without reconfiguring replication
or sharding, so that part of the data is simply not available and the
river does not restart itself on another node?

On Mon, Mar 28, 2011 at 12:20 PM, Shay Banon
shay.banon@elasticsearch.com wrote:

The river should be restarted on another node. Maybe you can recreate it and
explain the conditions under which it happens?


(Shay Banon) #4

No, the defaults should work and support river failover. When you have more info, ping back.

On Monday, March 28, 2011 at 1:07 PM, Michel Conrad wrote:

I am currently trying to recreate the issue. Could it be related to
my using the default configuration, without reconfiguring replication
or sharding, so that part of the data is simply not available and the
river does not restart itself on another node?


(Michel Conrad) #5

The issue reappears on the test cluster when I do the following steps:

  1. reset the database by deleting all the storage folders
  2. start elasticsearch
  3. create the river

-- now the river runs on one node, in this case 192.168.1.190

  4. when I kill this node and immediately restart it, the river is
    not restarted; the status of the river stays like this:

curl http://127.0.0.1:9200/_river/dsearch_all/_status?pretty=true
{
"_index" : "_river",
"_type" : "dsearch_all",
"_id" : "_status",
"_version" : 1, "_source" :
{"ok":true,"node":{"id":"toBTLyVKTIyvcRFHNp-RFg","name":"Marlene Alraune","transport_address":"inet[/192.168.1.190:9300]"}}
}

Additionally, if the river thread on the node is terminated by an
exception, the river is not restarted automatically.
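
A minimal sketch of the check described above, in Python: it reads the node recorded in a river's _status document and tests whether that node is still among the nodes of the cluster. The _status URL and the document shape come from the output above; the helper names, and the idea of passing in the set of live node ids (however it is obtained), are assumptions made for illustration only.

```python
import json
from urllib.request import urlopen


def river_node(status_doc):
    """Return (node_id, transport_address) recorded in a river _status document."""
    node = status_doc["_source"]["node"]
    return node["id"], node["transport_address"]


def is_stale(status_doc, live_node_ids):
    """True when the river still claims a node id that is not in the cluster."""
    node_id, _ = river_node(status_doc)
    return node_id not in live_node_ids


def fetch_status(river_name, base_url="http://127.0.0.1:9200"):
    """Fetch a river's _status document over HTTP (same URL as the curl call)."""
    with urlopen(f"{base_url}/_river/{river_name}/_status") as response:
        return json.load(response)


# Sample copied from the status output shown above.
SAMPLE_STATUS = json.loads("""
{
  "_index": "_river",
  "_type": "dsearch_all",
  "_id": "_status",
  "_version": 1,
  "_source": {"ok": true, "node": {"id": "toBTLyVKTIyvcRFHNp-RFg",
    "name": "Marlene Alraune",
    "transport_address": "inet[/192.168.1.190:9300]"}}
}
""")
```

Calling is_stale(fetch_status("dsearch_all"), live_ids) against a running cluster would flag the situation reported here, where the status keeps pointing at a node that is gone. Comparing node ids rather than addresses is a choice of this sketch, not something the thread prescribes.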

On Tue, Mar 29, 2011 at 1:30 AM, Shay Banon
shay.banon@elasticsearch.com wrote:

No, the defaults should work and support river failover. When you have
more info, ping back.


(Shay Banon) #6

When you say delete all storage folders for the database, what database is it? Also, which river are you using, which thread failure are you referring to (which implementation)? Is that a single elasticsearch node scenario now?

On Tuesday, April 12, 2011 at 1:55 PM, Michel Conrad wrote:

The issue reappears on the test cluster when I do the following steps:

  1. reset the database by deleting all the storage folders
  2. start elasticsearch
  3. create the river

-- now the river runs on one node, in this case 192.168.1.190

  4. when I kill this node and immediately restart it, the river is
    not restarted; the status of the river stays like this:

curl http://127.0.0.1:9200/_river/dsearch_all/_status?pretty=true
{
"_index" : "_river",
"_type" : "dsearch_all",
"_id" : "_status",
"_version" : 1, "_source" :
{"ok":true,"node":{"id":"toBTLyVKTIyvcRFHNp-RFg","name":"Marlene Alraune","transport_address":"inet[/192.168.1.190:9300]"}}
}

Additionally, if the river thread on the node is terminated by an
exception, the river is not restarted automatically.


(Michel Conrad) #7

I have a 5-server setup to test with, I am using a river I wrote to
import data from a Cassandra database, and I am currently on
elasticsearch 0.15.2.

By resetting the database, I mean deleting the folders where
elasticsearch stores its data (I am using a file-based storage type at
the moment).

After restarting the cluster, the health goes to green, but the river
is not being started on any node.

On Tue, Apr 12, 2011 at 1:00 PM, Shay Banon
shay.banon@elasticsearch.com wrote:

When you say delete all storage folders for the database, what database is
it? Also, which river are you using, which thread failure are you referring
to (which implementation)? Is that a single elasticsearch node scenario now?


(Michel Conrad) #8

Here is some additional information from the logfile:

[2011-04-12 14:33:52,321][WARN ][river ] [Smartship
Friday] failed to get _meta from [trendiction]/[dsearch_all]
org.elasticsearch.cluster.block.ClusterBlockException: blocked by:
[1/state not recovered / initialized];
at org.elasticsearch.cluster.block.ClusterBlocks.indexBlockedException(ClusterBlocks.java:131)
at org.elasticsearch.cluster.block.ClusterBlocks.indexBlockedRaiseException(ClusterBlocks.java:115)
at org.elasticsearch.action.get.TransportGetAction.checkBlock(TransportGetAction.java:85)
at org.elasticsearch.action.get.TransportGetAction.checkBlock(TransportGetAction.java:62)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:109)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:88)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:69)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:45)
at org.elasticsearch.action.support.BaseAction.execute(BaseAction.java:61)
at org.elasticsearch.client.node.NodeClient.get(NodeClient.java:152)
at org.elasticsearch.client.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:109)
at org.elasticsearch.client.action.support.BaseRequestBuilder.execute(BaseRequestBuilder.java:56)
at org.elasticsearch.river.RiversService$ApplyRivers.riverClusterChanged(RiversService.java:212)
at org.elasticsearch.river.cluster.RiverClusterService$1.run(RiverClusterService.java:126)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Connection to 192.168.1.195 closed.

On Tue, Apr 12, 2011 at 1:20 PM, Michel Conrad
michel.conrad@trendiction.com wrote:

I have a 5-server setup to test with, I am using a river I wrote to
import data from a Cassandra database, and I am currently on
elasticsearch 0.15.2.

By resetting the database, I mean deleting the folders where
elasticsearch stores its data (I am using a file-based storage type at
the moment).

After restarting the cluster, the health goes to green, but the river
is not being started on any node.


(Shay Banon) #9

If you have a custom river, then you need to handle the case of the "river thread" exiting and make sure it doesn't.
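
One way to read this advice, sketched in Python rather than in the river's own Java: the worker loop catches whatever a single iteration throws, logs it, and carries on, so the thread only ends when it is asked to stop. GuardedWorker, step, and retry_delay are hypothetical names for this illustration and are not part of any elasticsearch API.

```python
import logging
import threading

log = logging.getLogger("river.sketch")


class GuardedWorker(threading.Thread):
    """Runs `step` repeatedly; an exception in one iteration does not end the thread."""

    def __init__(self, step, retry_delay=1.0):
        super().__init__(daemon=True)
        self._step = step                # callable doing one unit of work
        self._retry_delay = retry_delay  # pause after a failed iteration, in seconds
        self._stop_event = threading.Event()
        self.failures = 0

    def run(self):
        while not self._stop_event.is_set():
            try:
                self._step()
            except Exception:
                # Record the failure and keep the loop alive instead of exiting.
                self.failures += 1
                log.exception("step failed, retrying")
                self._stop_event.wait(self._retry_delay)

    def close(self):
        """Ask the loop to finish, e.g. when the river itself is closed."""
        self._stop_event.set()
```

In a Java river the same idea would presumably be a try/catch around the body of the loop that the river's thread runs, so that an unexpected exception is logged and retried rather than silently ending the thread.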

The log file entry you posted should not prevent the river from starting. It's hard for me to help you since I can't run your river (and no, I am not going to run it). Can you recreate it from a clean install of elasticsearch using the "dummy" river?

On Tuesday, April 12, 2011 at 3:41 PM, Michel Conrad wrote:

Here is some additional information from the logfile:

[2011-04-12 14:33:52,321][WARN ][river ] [Smartship
Friday] failed to get _meta from [trendiction]/[dsearch_all]
org.elasticsearch.cluster.block.ClusterBlockException: blocked by:
[1/state not recovered / initialized];
at org.elasticsearch.cluster.block.ClusterBlocks.indexBlockedException(ClusterBlocks.java:131)
at org.elasticsearch.cluster.block.ClusterBlocks.indexBlockedRaiseException(ClusterBlocks.java:115)
at org.elasticsearch.action.get.TransportGetAction.checkBlock(TransportGetAction.java:85)
at org.elasticsearch.action.get.TransportGetAction.checkBlock(TransportGetAction.java:62)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:109)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:88)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:69)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:45)
at org.elasticsearch.action.support.BaseAction.execute(BaseAction.java:61)
at org.elasticsearch.client.node.NodeClient.get(NodeClient.java:152)
at org.elasticsearch.client.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:109)
at org.elasticsearch.client.action.support.BaseRequestBuilder.execute(BaseRequestBuilder.java:56)
at org.elasticsearch.river.RiversService$ApplyRivers.riverClusterChanged(RiversService.java:212)
at org.elasticsearch.river.cluster.RiverClusterService$1.run(RiverClusterService.java:126)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Connection to 192.168.1.195 closed.


(Michel Conrad) #10

Thanks for the fast reply, I will try it out with the dummy river and
post the results.

On Tue, Apr 12, 2011 at 3:23 PM, Shay Banon
shay.banon@elasticsearch.com wrote:

If you have a custom river, then you need to handle the case of the
"river thread" exiting and make sure it doesn't.
The log file entry you posted should not prevent the river from starting. It's
hard for me to help you since I can't run your river (and no, I am not going
to run it). Can you recreate it from a clean install of elasticsearch using
the "dummy" river?


(Michel Conrad) #11

Even with the dummy river, failover is not working in my case. The
logfile is from the master node.
After starting the cluster, the river runs correctly. The logfile
goes up to this line:

  • [2011-04-12 16:13:40,600][INFO ][river.dummy ] [Cody
    Mushumanski gun Man aka: the hunter] [dummy][dummy_river] start

When I kill the node running the river, the river is not restarted on
another node; instead, some errors appear. I am running only the
dummy river on the cluster.

[2011-04-12 16:13:28,990][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
initializing ...
[2011-04-12 16:13:28,995][INFO ][plugins ] [Cody
Mushumanski gun Man aka: the hunter] loaded [river-trendiction]
[2011-04-12 16:13:30,027][WARN ][indices ] [Cody
Mushumanski gun Man aka: the hunter] lucene default FieldCache is
used, not enabling eager reader based cache eviction
[2011-04-12 16:13:30,116][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
initialized
[2011-04-12 16:13:30,116][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
starting ...
[2011-04-12 16:13:30,225][INFO ][transport ] [Cody
Mushumanski gun Man aka: the hunter] bound_address
{inet[/0:0:0:0:0:0:0:0:9300]}, publish_address
{inet[/192.168.1.190:9300]}
[2011-04-12 16:13:33,256][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] new_master [Cody Mushumanski gun
Man aka: the hunter][FG6YrkLqTTqwWxVnA99Xlg][inet[/192.168.1.190:9300]],
reason: zen-disco-join (elected_as_master)
[2011-04-12 16:13:33,270][INFO ][discovery ] [Cody
Mushumanski gun Man aka: the hunter]
trendictionsearch/FG6YrkLqTTqwWxVnA99Xlg
[2011-04-12 16:13:33,349][INFO ][http ] [Cody
Mushumanski gun Man aka: the hunter] bound_address
{inet[/0:0:0:0:0:0:0:0:9200]}, publish_address
{inet[/192.168.1.190:9200]}
[2011-04-12 16:13:33,350][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
started
[2011-04-12 16:13:34,986][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added {[Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]],}, reason:
zen-disco-receive(join from node[[Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]]])
[2011-04-12 16:13:35,964][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added
{[Zaladane][6pbf3UETTdiuiyHUIk6bwQ][inet[/192.168.1.194:9300]],},
reason: zen-disco-receive(join from
node[[Zaladane][6pbf3UETTdiuiyHUIk6bwQ][inet[/192.168.1.194:9300]]])
[2011-04-12 16:13:37,146][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added
{[Poison][OWoKy1PKRqKB9yTgfz_7Eg][inet[/192.168.1.191:9300]],},
reason: zen-disco-receive(join from
node[[Poison][OWoKy1PKRqKB9yTgfz_7Eg][inet[/192.168.1.191:9300]]])
[2011-04-12 16:13:38,116][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added
{[Dagger][DN31iuVQRw2MKAD3U5Q1mg][inet[/192.168.1.195:9300]],},
reason: zen-disco-receive(join from
node[[Dagger][DN31iuVQRw2MKAD3U5Q1mg][inet[/192.168.1.195:9300]]])
[2011-04-12 16:13:38,700][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [_river] creating index, cause
[gateway], shards [1]/[2], mappings [dsearch_all, dummy_river]
[2011-04-12 16:13:38,870][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adb] creating
index, cause [gateway], shards [5]/[2], mappings [content]
[2011-04-12 16:13:39,041][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adc] creating
index, cause [gateway], shards [5]/[2], mappings [content]
[2011-04-12 16:13:39,178][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3add] creating
index, cause [gateway], shards [5]/[2], mappings [content]
[2011-04-12 16:13:39,294][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all] creating index,
cause [gateway], shards [5]/[2], mappings [dsearch_all]
[2011-04-12 16:13:39,303][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [_river] created and added to
cluster_state
[2011-04-12 16:13:39,311][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adb] created and
added to cluster_state
[2011-04-12 16:13:39,324][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adc] created and
added to cluster_state
[2011-04-12 16:13:39,334][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3add] created and
added to cluster_state
[2011-04-12 16:13:39,345][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all] created and added
to cluster_state
[2011-04-12 16:13:40,600][INFO ][river.dummy ] [Cody
Mushumanski gun Man aka: the hunter] [dummy][dummy_river] create
[2011-04-12 16:13:40,600][INFO ][river.dummy ] [Cody
Mushumanski gun Man aka: the hunter] [dummy][dummy_river] start

[2011-04-12 16:14:40,632][WARN ][river ] [Cody
Mushumanski gun Man aka: the hunter] failed to create river
[dummy][dummy_river]
org.elasticsearch.action.UnavailableShardsException: [_river][0] [3]
shardIt, [1] active : Timeout waiting for [1m], request: index
{[_river][dummy_river][_status],
source[{"ok":true,"node":{"id":"FG6YrkLqTTqwWxVnA99Xlg","name":"Cody
Mushumanski gun Man aka: the
hunter","transport_address":"inet[/192.168.1.190:9300]"}}]}
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$3.onTimeout(TransportShardReplicationOperationAction.java:409)
at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:281)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
[2011-04-12 16:15:40,138][WARN ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] failed to reconnect to node
[Connors, Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]]
org.elasticsearch.transport.ConnectTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]] connect_timeout[30s]
at org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:512)
at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:473)
at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:126)
at org.elasticsearch.cluster.service.InternalClusterService$ReconnectToNodes.run(InternalClusterService.java:301)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
at org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
at org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
at org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
... 3 more
[2011-04-12 16:15:40,635][WARN ][river ] [Cody
Mushumanski gun Man aka: the hunter] failed to write failed status for
river creation
org.elasticsearch.action.UnavailableShardsException: [_river][0] [3]
shardIt, [1] active : Timeout waiting for [1m], request: index
{[_river][dummy_river][_status],
source[{"ok":true,"node":{"id":"FG6YrkLqTTqwWxVnA99Xlg","name":"Cody
Mushumanski gun Man aka: the
hunter","transport_address":"inet[/192.168.1.190:9300]"}}]}
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$3.onTimeout(TransportShardReplicationOperationAction.java:409)
at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:281)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
[2011-04-12 16:15:40,637][WARN ][transport ] [Cody
Mushumanski gun Man aka: the hunter] Received response for a request
that has timed out, sent [120034ms] ago, timed out [90034ms] ago,
action [/cluster/nodes/indices/shard/store/node], node
[[Zaladane][6pbf3UETTdiuiyHUIk6bwQ][inet[/192.168.1.194:9300]]], id
[234]
[2011-04-12 16:15:40,650][WARN ][transport ] [Cody
Mushumanski gun Man aka: the hunter] Received response for a request
that has timed out, sent [90011ms] ago, timed out [60011ms] ago,
action [/cluster/nodes/indices/shard/store/node], node
[[Dagger][DN31iuVQRw2MKAD3U5Q1mg][inet[/192.168.1.195:9300]]], id
[398]
[2011-04-12 16:15:40,659][DEBUG][action.search.type ] [Cody
Mushumanski gun Man aka: the hunter] [2] Failed to execute fetch phase
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][search/phase/fetch/id]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:194)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:166)
at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:318)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetch(TransportSearchDfsQueryThenFetchAction.java:226)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.innerExecuteFetchPhase(TransportSearchDfsQueryThenFetchAction.java:187)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetchPhase(TransportSearchDfsQueryThenFetchAction.java:164)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$600(TransportSearchDfsQueryThenFetchAction.java:63)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$3.onResult(TransportSearchDfsQueryThenFetchAction.java:145)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$3.onResult(TransportSearchDfsQueryThenFetchAction.java:140)
at org.elasticsearch.search.action.SearchServiceTransportAction$3.handleResponse(SearchServiceTransportAction.java:175)
at org.elasticsearch.search.action.SearchServiceTransportAction$3.handleResponse(SearchServiceTransportAction.java:168)
at org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:132)
at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:102)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:754)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:302)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:317)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:299)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:216)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:540)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:274)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:261)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.elasticsearch.transport.NodeNotConnectedException:
[Connors, Curtis][inet[/192.168.1.192:9300]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:566)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:424)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:179)
... 32 more
[2011-04-12 16:15:40,670][WARN ][search.action ] [Cody
Mushumanski gun Man aka: the hunter] Failed to send release search
context
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][search/freeContext]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:194)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:166)
at org.elasticsearch.search.action.SearchServiceTransportAction.sendFreeContext(SearchServiceTransportAction.java:95)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.releaseIrrelevantSearchContexts(TransportSearchTypeAction.java:319)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.finishHim(TransportSearchDfsQueryThenFetchAction.java:254)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$1300(TransportSearchDfsQueryThenFetchAction.java:63)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$6.onFailure(TransportSearchDfsQueryThenFetchAction.java:242)
at org.elasticsearch.search.action.SearchServiceTransportAction$8.handleException(SearchServiceTransportAction.java:329)
at org.elasticsearch.transport.TransportService$2.run(TransportService.java:197)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.elasticsearch.transport.NodeNotConnectedException:
[Connors, Curtis][inet[/192.168.1.192:9300]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:566)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:424)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:179)
... 11 more
[2011-04-12 16:15:40,831][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,957][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,968][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,978][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,982][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,985][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,987][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,990][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,993][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,995][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:41,001][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:41,004][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] removed {[Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]],}, reason:
zen-disco-node_left([Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]])
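
One possible reading of the `[3] shardIt, [1] active` part of the UnavailableShardsException above: shard 0 of `_river` has three copies and only one of them is active. The sketch below pulls those numbers out of the message and compares them against a simple majority rule. The majority rule is an assumption for illustration, not something confirmed in this thread, and the helper names `parse_shard_summary` and `majority_needed` are hypothetical.

```python
import re

# Pattern for the shard summary inside the exception message,
# e.g. "[_river][0] [3] shardIt, [1] active".
SHARD_SUMMARY = re.compile(
    r"\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\] "
    r"\[(?P<copies>\d+)\] shardIt, \[(?P<active>\d+)\] active"
)


def parse_shard_summary(message):
    """Extract index name, shard id, total copies and active copies."""
    # The pasted log lines are hard-wrapped, so collapse whitespace first.
    flat = " ".join(message.split())
    match = SHARD_SUMMARY.search(flat)
    if match is None:
        raise ValueError("no shard summary found in message")
    return {
        "index": match.group("index"),
        "shard": int(match.group("shard")),
        "copies": int(match.group("copies")),
        "active": int(match.group("active")),
    }


def majority_needed(copies):
    """Assumed rule: a write needs more than half of all shard copies."""
    return copies // 2 + 1


# Message text copied from the log, including its line wrap.
message = """org.elasticsearch.action.UnavailableShardsException: [_river][0] [3]
shardIt, [1] active : Timeout waiting for [1m], request: index"""

summary = parse_shard_summary(message)
needed = majority_needed(summary["copies"])
print(summary)
print("needed:", needed, "satisfied:", summary["active"] >= needed)
```

Under that assumption, two active copies would be needed and only one is available, which would be consistent with the `_status` write timing out after one minute.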


(Shay Banon) #12

Can you list the exact steps you use to recreate it, including what you start, how many nodes you start, the configuration, and so on?
On Tuesday, April 12, 2011 at 5:26 PM, Michel Conrad wrote:

Even with the dummy river the failover is not working in my case. The
logfile is from the master node.
After starting the cluster, the river runs correctly. The logfile
goes up to this line:

  • [2011-04-12 16:13:40,600][INFO ][river.dummy ] [Cody
    Mushumanski gun Man aka: the hunter] [dummy][dummy_river] start

When I kill the node running the river, the river is not restarted on
another node; instead, some errors appear. I am running only the
dummy river on the cluster.
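
A minimal sketch of how one might check whether the node recorded in a river's `_status` document is still part of the cluster. It assumes the document has the shape seen in the index request in this log (`{"ok": true, "node": {"id": ..., "name": ..., "transport_address": ...}}`). The function names are hypothetical, fetching the document from the cluster is left out, and the stale document at the end is made up for illustration.

```python
import json

# Status document as it appears in the index request for
# [_river][dummy_river][_status] in the log.
STATUS_SOURCE = """
{"ok": true,
 "node": {"id": "FG6YrkLqTTqwWxVnA99Xlg",
          "name": "Cody Mushumanski gun Man aka: the hunter",
          "transport_address": "inet[/192.168.1.190:9300]"}}
"""


def river_node_id(status_source):
    """Return the id of the node recorded in a river _status document."""
    status = json.loads(status_source)
    return status["node"]["id"]


def river_needs_failover(status_source, live_node_ids):
    """True when the recorded node is not among the live nodes."""
    return river_node_id(status_source) not in live_node_ids


# Node ids taken from the log. "Connors, Curtis"
# (T5cWIcpXTIehneRXTALiLQ) is the node that was removed.
live_after_kill = {
    "FG6YrkLqTTqwWxVnA99Xlg",
    "6pbf3UETTdiuiyHUIk6bwQ",
    "OWoKy1PKRqKB9yTgfz_7Eg",
    "DN31iuVQRw2MKAD3U5Q1mg",
}

# Hypothetical stale document that still names the removed node.
stale_source = json.dumps({"ok": True, "node": {"id": "T5cWIcpXTIehneRXTALiLQ"}})

print(river_needs_failover(STATUS_SOURCE, live_after_kill))
print(river_needs_failover(stale_source, live_after_kill))
```

The first check uses the document from the log, which names the master node. The second shows the case where the status still points at a node that has left the cluster.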

[2011-04-12 16:13:28,990][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
initializing ...
[2011-04-12 16:13:28,995][INFO ][plugins ] [Cody
Mushumanski gun Man aka: the hunter] loaded [river-trendiction]
[2011-04-12 16:13:30,027][WARN ][indices ] [Cody
Mushumanski gun Man aka: the hunter] lucene default FieldCache is
used, not enabling eager reader based cache eviction
[2011-04-12 16:13:30,116][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
initialized
[2011-04-12 16:13:30,116][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
starting ...
[2011-04-12 16:13:30,225][INFO ][transport ] [Cody
Mushumanski gun Man aka: the hunter] bound_address
{inet[/0:0:0:0:0:0:0:0:9300]}, publish_address
{inet[/192.168.1.190:9300]}
[2011-04-12 16:13:33,256][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] new_master [Cody Mushumanski gun
Man aka: the hunter][FG6YrkLqTTqwWxVnA99Xlg][inet[/192.168.1.190:9300]],
reason: zen-disco-join (elected_as_master)
[2011-04-12 16:13:33,270][INFO ][discovery ] [Cody
Mushumanski gun Man aka: the hunter]
trendictionsearch/FG6YrkLqTTqwWxVnA99Xlg
[2011-04-12 16:13:33,349][INFO ][http ] [Cody
Mushumanski gun Man aka: the hunter] bound_address
{inet[/0:0:0:0:0:0:0:0:9200]}, publish_address
{inet[/192.168.1.190:9200]}
[2011-04-12 16:13:33,350][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
started
[2011-04-12 16:13:34,986][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added {[Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]],}, reason:
zen-disco-receive(join from node[[Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]]])
[2011-04-12 16:13:35,964][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added
{[Zaladane][6pbf3UETTdiuiyHUIk6bwQ][inet[/192.168.1.194:9300]],},
reason: zen-disco-receive(join from
node[[Zaladane][6pbf3UETTdiuiyHUIk6bwQ][inet[/192.168.1.194:9300]]])
[2011-04-12 16:13:37,146][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added
{[Poison][OWoKy1PKRqKB9yTgfz_7Eg][inet[/192.168.1.191:9300]],},
reason: zen-disco-receive(join from
node[[Poison][OWoKy1PKRqKB9yTgfz_7Eg][inet[/192.168.1.191:9300]]])
[2011-04-12 16:13:38,116][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added
{[Dagger][DN31iuVQRw2MKAD3U5Q1mg][inet[/192.168.1.195:9300]],},
reason: zen-disco-receive(join from
node[[Dagger][DN31iuVQRw2MKAD3U5Q1mg][inet[/192.168.1.195:9300]]])
[2011-04-12 16:13:38,700][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [_river] creating index, cause
[gateway], shards [1]/[2], mappings [dsearch_all, dummy_river]
[2011-04-12 16:13:38,870][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adb] creating
index, cause [gateway], shards [5]/[2], mappings [content]
[2011-04-12 16:13:39,041][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adc] creating
index, cause [gateway], shards [5]/[2], mappings [content]
[2011-04-12 16:13:39,178][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3add] creating
index, cause [gateway], shards [5]/[2], mappings [content]
[2011-04-12 16:13:39,294][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all] creating index,
cause [gateway], shards [5]/[2], mappings [dsearch_all]
[2011-04-12 16:13:39,303][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [_river] created and added to
cluster_state
[2011-04-12 16:13:39,311][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adb] created and
added to cluster_state
[2011-04-12 16:13:39,324][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adc] created and
added to cluster_state
[2011-04-12 16:13:39,334][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3add] created and
added to cluster_state
[2011-04-12 16:13:39,345][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all] created and added
to cluster_state
[2011-04-12 16:13:40,600][INFO ][river.dummy ] [Cody
Mushumanski gun Man aka: the hunter] [dummy][dummy_river] create
[2011-04-12 16:13:40,600][INFO ][river.dummy ] [Cody
Mushumanski gun Man aka: the hunter] [dummy][dummy_river] start

[2011-04-12 16:14:40,632][WARN ][river ] [Cody
Mushumanski gun Man aka: the hunter] failed to create river
[dummy][dummy_river]
org.elasticsearch.action.UnavailableShardsException: [_river][0] [3]
shardIt, [1] active : Timeout waiting for [1m], request: index
{[_river][dummy_river][_status],
source[{"ok":true,"node":{"id":"FG6YrkLqTTqwWxVnA99Xlg","name":"Cody
Mushumanski gun Man aka: the
hunter","transport_address":"inet[/192.168.1.190:9300]"}}]}
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$3.onTimeout(TransportShardReplicationOperationAction.java:409)
at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:281)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
[2011-04-12 16:15:40,138][WARN ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] failed to reconnect to node
[Connors, Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]]
org.elasticsearch.transport.ConnectTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]] connect_timeout[30s]
at org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:512)
at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:473)
at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:126)
at org.elasticsearch.cluster.service.InternalClusterService$ReconnectToNodes.run(InternalClusterService.java:301)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
at org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
at org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
at org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
... 3 more
[2011-04-12 16:15:40,635][WARN ][river ] [Cody
Mushumanski gun Man aka: the hunter] failed to write failed status for
river creation
org.elasticsearch.action.UnavailableShardsException: [_river][0] [3]
shardIt, [1] active : Timeout waiting for [1m], request: index
{[_river][dummy_river][_status],
source[{"ok":true,"node":{"id":"FG6YrkLqTTqwWxVnA99Xlg","name":"Cody
Mushumanski gun Man aka: the
hunter","transport_address":"inet[/192.168.1.190:9300]"}}]}
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$3.onTimeout(TransportShardReplicationOperationAction.java:409)
at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:281)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
[2011-04-12 16:15:40,637][WARN ][transport ] [Cody
Mushumanski gun Man aka: the hunter] Received response for a request
that has timed out, sent [120034ms] ago, timed out [90034ms] ago,
action [/cluster/nodes/indices/shard/store/node], node
[[Zaladane][6pbf3UETTdiuiyHUIk6bwQ][inet[/192.168.1.194:9300]]], id
[234]
[2011-04-12 16:15:40,650][WARN ][transport ] [Cody
Mushumanski gun Man aka: the hunter] Received response for a request
that has timed out, sent [90011ms] ago, timed out [60011ms] ago,
action [/cluster/nodes/indices/shard/store/node], node
[[Dagger][DN31iuVQRw2MKAD3U5Q1mg][inet[/192.168.1.195:9300]]], id
[398]
[2011-04-12 16:15:40,659][DEBUG][action.search.type ] [Cody
Mushumanski gun Man aka: the hunter] [2] Failed to execute fetch phase
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][search/phase/fetch/id]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:194)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:166)
at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:318)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetch(TransportSearchDfsQueryThenFetchAction.java:226)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.innerExecuteFetchPhase(TransportSearchDfsQueryThenFetchAction.java:187)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetchPhase(TransportSearchDfsQueryThenFetchAction.java:164)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$600(TransportSearchDfsQueryThenFetchAction.java:63)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$3.onResult(TransportSearchDfsQueryThenFetchAction.java:145)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$3.onResult(TransportSearchDfsQueryThenFetchAction.java:140)
at org.elasticsearch.search.action.SearchServiceTransportAction$3.handleResponse(SearchServiceTransportAction.java:175)
at org.elasticsearch.search.action.SearchServiceTransportAction$3.handleResponse(SearchServiceTransportAction.java:168)
at org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:132)
at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:102)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:754)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:302)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:317)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:299)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:216)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:540)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:274)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:261)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.elasticsearch.transport.NodeNotConnectedException:
[Connors, Curtis][inet[/192.168.1.192:9300]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:566)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:424)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:179)
... 32 more
[2011-04-12 16:15:40,670][WARN ][search.action ] [Cody
Mushumanski gun Man aka: the hunter] Failed to send release search
context
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][search/freeContext]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:194)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:166)
at org.elasticsearch.search.action.SearchServiceTransportAction.sendFreeContext(SearchServiceTransportAction.java:95)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.releaseIrrelevantSearchContexts(TransportSearchTypeAction.java:319)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.finishHim(TransportSearchDfsQueryThenFetchAction.java:254)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$1300(TransportSearchDfsQueryThenFetchAction.java:63)
at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$6.onFailure(TransportSearchDfsQueryThenFetchAction.java:242)
at org.elasticsearch.search.action.SearchServiceTransportAction$8.handleException(SearchServiceTransportAction.java:329)
at org.elasticsearch.transport.TransportService$2.run(TransportService.java:197)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.elasticsearch.transport.NodeNotConnectedException:
[Connors, Curtis][inet[/192.168.1.192:9300]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:566)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:424)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:179)
... 11 more
[2011-04-12 16:15:40,831][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,957][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,968][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,978][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,982][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,985][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,987][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,990][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,993][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,995][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:41,001][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:41,004][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] removed {[Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]],}, reason:
zen-disco-node_left([Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]])


(Michel Conrad) #13

I start 5 nodes using the following configuration.
cluster:
  name: search
path:
  logs: /var/log/elasticsearch/
  work: /hd1/elasticsearch_work/
  data: /hd1/elasticsearch_data/
index:
  number_of_replicas: 2
gateway:
  recover_after_nodes: 3
  recover_after_time: 5m
  expected_nodes: 5

After the cluster health is green, I create the river:
curl -XPUT 'http://127.0.0.1:9200/_river/dummy_river/_meta' -d '{"type":"dummy"}'

The river starts on a node. I get its status with:
curl -XGET 'http://127.0.0.1:9200/_river/dummy_river/_status?pretty=true'

I kill the node where the river is running with:
pgrep -f elasticsearch | xargs kill

Then the errors from the previous email appear and the river is not
automatically restarted.
The status of the river stays the way it was.
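
One way to check for the stale status described above is to compare the node id stored in the river's _status document with the ids of the nodes currently in the cluster. The following is a minimal Python sketch, not part of elasticsearch. The helper names river_node_id and river_is_stale are hypothetical, the document shape (a "node" object with an "id", optionally wrapped in "_source") is assumed from the status source printed in the logs, and the example ids are borrowed from the log for illustration only.

```python
def river_node_id(status_doc):
    """Return the node id recorded in a river _status document, or None.

    Assumption: the document is either the bare status source or a GET
    response that wraps the source in a "_source" field.
    """
    source = status_doc.get("_source", status_doc)
    node = source.get("node") or {}
    return node.get("id")


def river_is_stale(status_doc, live_node_ids):
    """True if the status points at a node that is no longer in the cluster."""
    node_id = river_node_id(status_doc)
    return node_id is not None and node_id not in set(live_node_ids)


# Hypothetical status document: the river is still recorded on the node
# that the master log reports as removed.
EXAMPLE_STATUS = {
    "_source": {
        "ok": True,
        "node": {
            "id": "T5cWIcpXTIehneRXTALiLQ",
            "name": "Connors, Curtis",
            "transport_address": "inet[/192.168.1.192:9300]",
        },
    }
}

# Ids of the four nodes that remain after the kill, as listed in the log.
LIVE_NODE_IDS = {
    "FG6YrkLqTTqwWxVnA99Xlg",
    "6pbf3UETTdiuiyHUIk6bwQ",
    "OWoKy1PKRqKB9yTgfz_7Eg",
    "DN31iuVQRw2MKAD3U5Q1mg",
}

if __name__ == "__main__":
    print(river_is_stale(EXAMPLE_STATUS, LIVE_NODE_IDS))
```

In practice the status document would come from the curl -XGET call above and the live ids from whatever node listing the cluster exposes; both are left out here so the sketch runs without a cluster.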

On Tue, Apr 12, 2011 at 5:53 PM, Shay Banon
shay.banon@elasticsearch.com wrote:

Can you list the exact steps you use to recreate it, including what to
start, how many nodes to start, configuration, and so on.

On Tuesday, April 12, 2011 at 5:26 PM, Michel Conrad wrote:

Even with the dummy river the failover is not working in my case. The
logfile is from the master node.
After starting the cluster the river runs correctly. The logfile
goes up to this line:

  • [2011-04-12 16:13:40,600][INFO ][river.dummy ] [Cody
    Mushumanski gun Man aka: the hunter] [dummy][dummy_river] start

When I kill the node running the river, the river is not restarted on
another node, but instead some errors appear. I am running only the
dummy river on the cluster.

[2011-04-12 16:13:28,990][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
initializing ...
[2011-04-12 16:13:28,995][INFO ][plugins ] [Cody
Mushumanski gun Man aka: the hunter] loaded [river-trendiction]
[2011-04-12 16:13:30,027][WARN ][indices ] [Cody
Mushumanski gun Man aka: the hunter] lucene default FieldCache is
used, not enabling eager reader based cache eviction
[2011-04-12 16:13:30,116][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
initialized
[2011-04-12 16:13:30,116][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
starting ...
[2011-04-12 16:13:30,225][INFO ][transport ] [Cody
Mushumanski gun Man aka: the hunter] bound_address
{inet[/0:0:0:0:0:0:0:0:9300]}, publish_address
{inet[/192.168.1.190:9300]}
[2011-04-12 16:13:33,256][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] new_master [Cody Mushumanski gun
Man aka: the hunter][FG6YrkLqTTqwWxVnA99Xlg][inet[/192.168.1.190:9300]],
reason: zen-disco-join (elected_as_master)
[2011-04-12 16:13:33,270][INFO ][discovery ] [Cody
Mushumanski gun Man aka: the hunter]
trendictionsearch/FG6YrkLqTTqwWxVnA99Xlg
[2011-04-12 16:13:33,349][INFO ][http ] [Cody
Mushumanski gun Man aka: the hunter] bound_address
{inet[/0:0:0:0:0:0:0:0:9200]}, publish_address
{inet[/192.168.1.190:9200]}
[2011-04-12 16:13:33,350][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
started
[2011-04-12 16:13:34,986][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added {[Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]],}, reason:
zen-disco-receive(join from node[[Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]]])
[2011-04-12 16:13:35,964][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added
{[Zaladane][6pbf3UETTdiuiyHUIk6bwQ][inet[/192.168.1.194:9300]],},
reason: zen-disco-receive(join from
node[[Zaladane][6pbf3UETTdiuiyHUIk6bwQ][inet[/192.168.1.194:9300]]])
[2011-04-12 16:13:37,146][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added
{[Poison][OWoKy1PKRqKB9yTgfz_7Eg][inet[/192.168.1.191:9300]],},
reason: zen-disco-receive(join from
node[[Poison][OWoKy1PKRqKB9yTgfz_7Eg][inet[/192.168.1.191:9300]]])
[2011-04-12 16:13:38,116][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added
{[Dagger][DN31iuVQRw2MKAD3U5Q1mg][inet[/192.168.1.195:9300]],},
reason: zen-disco-receive(join from
node[[Dagger][DN31iuVQRw2MKAD3U5Q1mg][inet[/192.168.1.195:9300]]])
[2011-04-12 16:13:38,700][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [_river] creating index, cause
[gateway], shards [1]/[2], mappings [dsearch_all, dummy_river]
[2011-04-12 16:13:38,870][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adb] creating
index, cause [gateway], shards [5]/[2], mappings [content]
[2011-04-12 16:13:39,041][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adc] creating
index, cause [gateway], shards [5]/[2], mappings [content]
[2011-04-12 16:13:39,178][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3add] creating
index, cause [gateway], shards [5]/[2], mappings [content]
[2011-04-12 16:13:39,294][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all] creating index,
cause [gateway], shards [5]/[2], mappings [dsearch_all]
[2011-04-12 16:13:39,303][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [_river] created and added to
cluster_state
[2011-04-12 16:13:39,311][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adb] created and
added to cluster_state
[2011-04-12 16:13:39,324][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adc] created and
added to cluster_state
[2011-04-12 16:13:39,334][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3add] created and
added to cluster_state
[2011-04-12 16:13:39,345][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all] created and added
to cluster_state
[2011-04-12 16:13:40,600][INFO ][river.dummy ] [Cody
Mushumanski gun Man aka: the hunter] [dummy][dummy_river] create
[2011-04-12 16:13:40,600][INFO ][river.dummy ] [Cody
Mushumanski gun Man aka: the hunter] [dummy][dummy_river] start

[2011-04-12 16:14:40,632][WARN ][river ] [Cody
Mushumanski gun Man aka: the hunter] failed to create river
[dummy][dummy_river]
org.elasticsearch.action.UnavailableShardsException: [_river][0] [3]
shardIt, [1] active : Timeout waiting for [1m], request: index
{[_river][dummy_river][_status],
source[{"ok":true,"node":{"id":"FG6YrkLqTTqwWxVnA99Xlg","name":"Cody
Mushumanski gun Man aka: the
hunter","transport_address":"inet[/192.168.1.190:9300]"}}]}
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$3.onTimeout(TransportShardReplicationOperationAction.java:409)
at
org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:281)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
[2011-04-12 16:15:40,138][WARN ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] failed to reconnect to node
[Connors, Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]]
org.elasticsearch.transport.ConnectTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]] connect_timeout[30s]
at
org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:512)
at
org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:473)
at
org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:126)
at
org.elasticsearch.cluster.service.InternalClusterService$ReconnectToNodes.run(InternalClusterService.java:301)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
at
org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
... 3 more
[2011-04-12 16:15:40,635][WARN ][river ] [Cody
Mushumanski gun Man aka: the hunter] failed to write failed status for
river creation
org.elasticsearch.action.UnavailableShardsException: [_river][0] [3]
shardIt, [1] active : Timeout waiting for [1m], request: index
{[_river][dummy_river][_status],
source[{"ok":true,"node":{"id":"FG6YrkLqTTqwWxVnA99Xlg","name":"Cody
Mushumanski gun Man aka: the
hunter","transport_address":"inet[/192.168.1.190:9300]"}}]}
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$3.onTimeout(TransportShardReplicationOperationAction.java:409)
at
org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:281)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
[2011-04-12 16:15:40,637][WARN ][transport ] [Cody
Mushumanski gun Man aka: the hunter] Received response for a request
that has timed out, sent [120034ms] ago, timed out [90034ms] ago,
action [/cluster/nodes/indices/shard/store/node], node
[[Zaladane][6pbf3UETTdiuiyHUIk6bwQ][inet[/192.168.1.194:9300]]], id
[234]
[2011-04-12 16:15:40,650][WARN ][transport ] [Cody
Mushumanski gun Man aka: the hunter] Received response for a request
that has timed out, sent [90011ms] ago, timed out [60011ms] ago,
action [/cluster/nodes/indices/shard/store/node], node
[[Dagger][DN31iuVQRw2MKAD3U5Q1mg][inet[/192.168.1.195:9300]]], id
[398]
[2011-04-12 16:15:40,659][DEBUG][action.search.type ] [Cody
Mushumanski gun Man aka: the hunter] [2] Failed to execute fetch phase
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][search/phase/fetch/id]
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:194)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:166)
at
org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:318)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetch(TransportSearchDfsQueryThenFetchAction.java:226)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.innerExecuteFetchPhase(TransportSearchDfsQueryThenFetchAction.java:187)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetchPhase(TransportSearchDfsQueryThenFetchAction.java:164)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$600(TransportSearchDfsQueryThenFetchAction.java:63)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$3.onResult(TransportSearchDfsQueryThenFetchAction.java:145)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$3.onResult(TransportSearchDfsQueryThenFetchAction.java:140)
at
org.elasticsearch.search.action.SearchServiceTransportAction$3.handleResponse(SearchServiceTransportAction.java:175)
at
org.elasticsearch.search.action.SearchServiceTransportAction$3.handleResponse(SearchServiceTransportAction.java:168)
at
org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:132)
at
org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:102)
at
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:754)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:302)
at
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:317)
at
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:299)
at
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:216)
at
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:540)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:274)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:261)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
at
org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.elasticsearch.transport.NodeNotConnectedException:
[Connors, Curtis][inet[/192.168.1.192:9300]] Node not connected
at
org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:566)
at
org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:424)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:179)
... 32 more
[2011-04-12 16:15:40,670][WARN ][search.action ] [Cody
Mushumanski gun Man aka: the hunter] Failed to send release search
context
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][search/freeContext]
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:194)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:166)
at
org.elasticsearch.search.action.SearchServiceTransportAction.sendFreeContext(SearchServiceTransportAction.java:95)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.releaseIrrelevantSearchContexts(TransportSearchTypeAction.java:319)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.finishHim(TransportSearchDfsQueryThenFetchAction.java:254)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$1300(TransportSearchDfsQueryThenFetchAction.java:63)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$6.onFailure(TransportSearchDfsQueryThenFetchAction.java:242)
at
org.elasticsearch.search.action.SearchServiceTransportAction$8.handleException(SearchServiceTransportAction.java:329)
at
org.elasticsearch.transport.TransportService$2.run(TransportService.java:197)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.elasticsearch.transport.NodeNotConnectedException:
[Connors, Curtis][inet[/192.168.1.192:9300]] Node not connected
at
org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:566)
at
org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:424)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:179)
... 11 more
[2011-04-12 16:15:40,831][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,957][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,968][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,978][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,982][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,985][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,987][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,990][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,993][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,995][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:41,001][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:41,004][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] removed {[Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]],}, reason:
zen-disco-node_left([Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]])


(Shay Banon) #14

Great, I found the problem. It is a more fundamental issue, revolving around how node failure is handled and how the river is reallocated to another node. I pushed a fix for it, so you can take master for a spin: https://github.com/elasticsearch/elasticsearch/issues/850.

Thanks for sticking around and helping flush out this important bug.
On Tuesday, April 12, 2011 at 7:46 PM, Michel Conrad wrote:

I start 5 nodes using the following configuration.
cluster:
  name: search
path:
  logs: /var/log/elasticsearch/
  work: /hd1/elasticsearch_work/
  data: /hd1/elasticsearch_data/
index:
  number_of_replicas: 2
gateway:
  recover_after_nodes: 3
  recover_after_time: 5m
  expected_nodes: 5

After the cluster health is green, I create the river:
curl -XPUT 'http://127.0.0.1:9200/_river/dummy_river/_meta' -d '{"type":"dummy"}'

The river starts on a node. I get its status with:
curl -XGET 'http://127.0.0.1:9200/_river/dummy_river/_status?pretty=true'

I kill the node where the river is running with:
pgrep -f elasticsearch | xargs kill

Then the errors from the previous email appear and the river is not
automatically restarted.
The status of the river stays the way it was.

On Tue, Apr 12, 2011 at 5:53 PM, Shay Banon
shay.banon@elasticsearch.com wrote:

Can you list the exact steps you use to recreate it, including what to
start, how many nodes to start, configuration, and so on.

On Tuesday, April 12, 2011 at 5:26 PM, Michel Conrad wrote:

Even with the dummy river the failover is not working in my case. The
logfile is from the master node.
After starting the cluster the river runs correctly. The logfile
goes up to this line:

  • [2011-04-12 16:13:40,600][INFO ][river.dummy ] [Cody
    Mushumanski gun Man aka: the hunter] [dummy][dummy_river] start

When I kill the node running the river, the river is not restarted on
another node, but instead some errors appear. I am running only the
dummy river on the cluster.

[2011-04-12 16:13:28,990][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
initializing ...
[2011-04-12 16:13:28,995][INFO ][plugins ] [Cody
Mushumanski gun Man aka: the hunter] loaded [river-trendiction]
[2011-04-12 16:13:30,027][WARN ][indices ] [Cody
Mushumanski gun Man aka: the hunter] lucene default FieldCache is
used, not enabling eager reader based cache eviction
[2011-04-12 16:13:30,116][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
initialized
[2011-04-12 16:13:30,116][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
starting ...
[2011-04-12 16:13:30,225][INFO ][transport ] [Cody
Mushumanski gun Man aka: the hunter] bound_address
{inet[/0:0:0:0:0:0:0:0:9300]}, publish_address
{inet[/192.168.1.190:9300]}
[2011-04-12 16:13:33,256][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] new_master [Cody Mushumanski gun
Man aka: the hunter][FG6YrkLqTTqwWxVnA99Xlg][inet[/192.168.1.190:9300]],
reason: zen-disco-join (elected_as_master)
[2011-04-12 16:13:33,270][INFO ][discovery ] [Cody
Mushumanski gun Man aka: the hunter]
trendictionsearch/FG6YrkLqTTqwWxVnA99Xlg
[2011-04-12 16:13:33,349][INFO ][http ] [Cody
Mushumanski gun Man aka: the hunter] bound_address
{inet[/0:0:0:0:0:0:0:0:9200]}, publish_address
{inet[/192.168.1.190:9200]}
[2011-04-12 16:13:33,350][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
started
[2011-04-12 16:13:34,986][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added {[Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]],}, reason:
zen-disco-receive(join from node[[Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]]])
[2011-04-12 16:13:35,964][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added
{[Zaladane][6pbf3UETTdiuiyHUIk6bwQ][inet[/192.168.1.194:9300]],},
reason: zen-disco-receive(join from
node[[Zaladane][6pbf3UETTdiuiyHUIk6bwQ][inet[/192.168.1.194:9300]]])
[2011-04-12 16:13:37,146][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added
{[Poison][OWoKy1PKRqKB9yTgfz_7Eg][inet[/192.168.1.191:9300]],},
reason: zen-disco-receive(join from
node[[Poison][OWoKy1PKRqKB9yTgfz_7Eg][inet[/192.168.1.191:9300]]])
[2011-04-12 16:13:38,116][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added
{[Dagger][DN31iuVQRw2MKAD3U5Q1mg][inet[/192.168.1.195:9300]],},
reason: zen-disco-receive(join from
node[[Dagger][DN31iuVQRw2MKAD3U5Q1mg][inet[/192.168.1.195:9300]]])
[2011-04-12 16:13:38,700][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [_river] creating index, cause
[gateway], shards [1]/[2], mappings [dsearch_all, dummy_river]
[2011-04-12 16:13:38,870][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adb] creating
index, cause [gateway], shards [5]/[2], mappings [content]
[2011-04-12 16:13:39,041][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adc] creating
index, cause [gateway], shards [5]/[2], mappings [content]
[2011-04-12 16:13:39,178][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3add] creating
index, cause [gateway], shards [5]/[2], mappings [content]
[2011-04-12 16:13:39,294][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all] creating index,
cause [gateway], shards [5]/[2], mappings [dsearch_all]
[2011-04-12 16:13:39,303][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [_river] created and added to
cluster_state
[2011-04-12 16:13:39,311][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adb] created and
added to cluster_state
[2011-04-12 16:13:39,324][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adc] created and
added to cluster_state
[2011-04-12 16:13:39,334][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3add] created and
added to cluster_state
[2011-04-12 16:13:39,345][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all] created and added
to cluster_state
[2011-04-12 16:13:40,600][INFO ][river.dummy ] [Cody
Mushumanski gun Man aka: the hunter] [dummy][dummy_river] create
[2011-04-12 16:13:40,600][INFO ][river.dummy ] [Cody
Mushumanski gun Man aka: the hunter] [dummy][dummy_river] start

[2011-04-12 16:14:40,632][WARN ][river ] [Cody
Mushumanski gun Man aka: the hunter] failed to create river
[dummy][dummy_river]
org.elasticsearch.action.UnavailableShardsException: [_river][0] [3]
shardIt, [1] active : Timeout waiting for [1m], request: index
{[_river][dummy_river][_status],
source[{"ok":true,"node":{"id":"FG6YrkLqTTqwWxVnA99Xlg","name":"Cody
Mushumanski gun Man aka: the
hunter","transport_address":"inet[/192.168.1.190:9300]"}}]}
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$3.onTimeout(TransportShardReplicationOperationAction.java:409)
at
org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:281)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
[2011-04-12 16:15:40,138][WARN ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] failed to reconnect to node
[Connors, Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]]
org.elasticsearch.transport.ConnectTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]] connect_timeout[30s]
at
org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:512)
at
org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:473)
at
org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:126)
at
org.elasticsearch.cluster.service.InternalClusterService$ReconnectToNodes.run(InternalClusterService.java:301)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
at
org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
... 3 more
[2011-04-12 16:15:40,635][WARN ][river ] [Cody
Mushumanski gun Man aka: the hunter] failed to write failed status for
river creation
org.elasticsearch.action.UnavailableShardsException: [_river][0] [3]
shardIt, [1] active : Timeout waiting for [1m], request: index
{[_river][dummy_river][_status],
source[{"ok":true,"node":{"id":"FG6YrkLqTTqwWxVnA99Xlg","name":"Cody
Mushumanski gun Man aka: the
hunter","transport_address":"inet[/192.168.1.190:9300]"}}]}
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$3.onTimeout(TransportShardReplicationOperationAction.java:409)
at
org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:281)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
[2011-04-12 16:15:40,637][WARN ][transport ] [Cody
Mushumanski gun Man aka: the hunter] Received response for a request
that has timed out, sent [120034ms] ago, timed out [90034ms] ago,
action [/cluster/nodes/indices/shard/store/node], node
[[Zaladane][6pbf3UETTdiuiyHUIk6bwQ][inet[/192.168.1.194:9300]]], id
[234]
[2011-04-12 16:15:40,650][WARN ][transport ] [Cody
Mushumanski gun Man aka: the hunter] Received response for a request
that has timed out, sent [90011ms] ago, timed out [60011ms] ago,
action [/cluster/nodes/indices/shard/store/node], node
[[Dagger][DN31iuVQRw2MKAD3U5Q1mg][inet[/192.168.1.195:9300]]], id
[398]
[2011-04-12 16:15:40,659][DEBUG][action.search.type ] [Cody
Mushumanski gun Man aka: the hunter] [2] Failed to execute fetch phase
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][search/phase/fetch/id]
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:194)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:166)
at
org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:318)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetch(TransportSearchDfsQueryThenFetchAction.java:226)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.innerExecuteFetchPhase(TransportSearchDfsQueryThenFetchAction.java:187)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetchPhase(TransportSearchDfsQueryThenFetchAction.java:164)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$600(TransportSearchDfsQueryThenFetchAction.java:63)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$3.onResult(TransportSearchDfsQueryThenFetchAction.java:145)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$3.onResult(TransportSearchDfsQueryThenFetchAction.java:140)
at
org.elasticsearch.search.action.SearchServiceTransportAction$3.handleResponse(SearchServiceTransportAction.java:175)
at
org.elasticsearch.search.action.SearchServiceTransportAction$3.handleResponse(SearchServiceTransportAction.java:168)
at
org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:132)
at
org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:102)
at
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:754)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:302)
at
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:317)
at
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:299)
at
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:216)
at
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:540)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:274)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:261)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
at
org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.elasticsearch.transport.NodeNotConnectedException:
[Connors, Curtis][inet[/192.168.1.192:9300]] Node not connected
at
org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:566)
at
org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:424)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:179)
... 32 more
[2011-04-12 16:15:40,670][WARN ][search.action ] [Cody
Mushumanski gun Man aka: the hunter] Failed to send release search
context
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][search/freeContext]
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:194)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:166)
at
org.elasticsearch.search.action.SearchServiceTransportAction.sendFreeContext(SearchServiceTransportAction.java:95)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.releaseIrrelevantSearchContexts(TransportSearchTypeAction.java:319)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.finishHim(TransportSearchDfsQueryThenFetchAction.java:254)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$1300(TransportSearchDfsQueryThenFetchAction.java:63)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$6.onFailure(TransportSearchDfsQueryThenFetchAction.java:242)
at
org.elasticsearch.search.action.SearchServiceTransportAction$8.handleException(SearchServiceTransportAction.java:329)
at
org.elasticsearch.transport.TransportService$2.run(TransportService.java:197)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.elasticsearch.transport.NodeNotConnectedException:
[Connors, Curtis][inet[/192.168.1.192:9300]] Node not connected
at
org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:566)
at
org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:424)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:179)
... 11 more
[2011-04-12 16:15:40,831][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,957][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,968][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,978][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,982][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,985][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,987][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,990][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,993][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,995][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:41,001][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:41,004][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] removed {[Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]],}, reason:
zen-disco-node_left([Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]])


(Michel Conrad) #15

I switched to master and gave it a try. Failover now works like a charm.

Thanks for the quick reaction and the fix.

On Tue, Apr 12, 2011 at 8:53 PM, Shay Banon
shay.banon@elasticsearch.com wrote:

Great, found the problem. It is more fundamental, basically revolving
around node failure and reallocating the river to another node. Pushed a fix
for it, so you can take master for a
spin: https://github.com/elasticsearch/elasticsearch/issues/850.
Thanks for sticking around and helping flush out this important bug.

On Tuesday, April 12, 2011 at 7:46 PM, Michel Conrad wrote:

I start 5 nodes using the following configuration.
cluster:
  name: search
path:
  logs: /var/log/elasticsearch/
  work: /hd1/elasticsearch_work/
  data: /hd1/elasticsearch_data/
index:
  number_of_replicas: 2
gateway:
  recover_after_nodes: 3
  recover_after_time: 5m
  expected_nodes: 5

After the health is green, I create the river:
curl -XPUT 'http://127.0.0.1:9200/_river/dummy_river/_meta' -d '{"type":"dummy"}'

The river starts on a node. I get the status with:
curl -XGET 'http://127.0.0.1:9200/_river/dummy_river/_status?pretty=true'

I kill the node where the river runs with:
pgrep -f elasticsearch | xargs kill

Then the errors from the previous email appear and the river is not
automatically restarted.
The status of the river stays the way it was.
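
The "status stays the way it was" symptom can be checked mechanically. Below is a minimal Python sketch, not part of Elasticsearch: the helper name `river_status_is_stale` is hypothetical, and it assumes the `_status` document has the same shape as the source shown in the log (`{"ok":true,"node":{"id":...}}`). It compares the node id recorded in the status against a set of node ids known to be alive; the ids below are copied from the log.

```python
import json


def river_status_is_stale(status_source, live_node_ids):
    """Return True when the node recorded in the river status is not alive.

    status_source: JSON text of the river _status document (shape assumed
                   from the log: {"ok": ..., "node": {"id": ...}}).
    live_node_ids: set of node ids currently in the cluster.
    """
    status = json.loads(status_source)
    node_id = status.get("node", {}).get("id")
    # A missing node id is treated as stale as well.
    return node_id not in live_node_ids


# Status body as it appears in the log (river recorded on the master node).
status_on_master = (
    '{"ok":true,"node":{"id":"FG6YrkLqTTqwWxVnA99Xlg",'
    '"name":"Cody Mushumanski gun Man aka: the hunter",'
    '"transport_address":"inet[/192.168.1.190:9300]"}}'
)

# Hypothetical status that still points at the killed node (Connors, Curtis).
status_on_dead_node = '{"ok":true,"node":{"id":"T5cWIcpXTIehneRXTALiLQ"}}'

# Node ids from the log, minus the node that left the cluster.
live_nodes = {
    "FG6YrkLqTTqwWxVnA99Xlg",
    "6pbf3UETTdiuiyHUIk6bwQ",
    "OWoKy1PKRqKB9yTgfz_7Eg",
    "DN31iuVQRw2MKAD3U5Q1mg",
}

print(river_status_is_stale(status_on_master, live_nodes))
print(river_status_is_stale(status_on_dead_node, live_nodes))
```

In practice the status text would come from the `curl -XGET` call above, and the live node ids from whatever node listing the cluster exposes; both are left out here to keep the sketch self-contained.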

On Tue, Apr 12, 2011 at 5:53 PM, Shay Banon
shay.banon@elasticsearch.com wrote:

Can you list the exact steps you use to recreate it, including what to
start, how many nodes to start, configuration, and so on?

On Tuesday, April 12, 2011 at 5:26 PM, Michel Conrad wrote:

Even with the dummy river, the failover is not working in my case. The
logfile is from the master node.
After starting the cluster, the river runs correctly. The logfile
goes up to this line:

  • [2011-04-12 16:13:40,600][INFO ][river.dummy ] [Cody
    Mushumanski gun Man aka: the hunter] [dummy][dummy_river] start

When I kill the node running the river, the river is not restarted on
another node, but instead some errors appear. I am running only the
dummy river
on the cluster.

[2011-04-12 16:13:28,990][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
initializing ...
[2011-04-12 16:13:28,995][INFO ][plugins ] [Cody
Mushumanski gun Man aka: the hunter] loaded [river-trendiction]
[2011-04-12 16:13:30,027][WARN ][indices ] [Cody
Mushumanski gun Man aka: the hunter] lucene default FieldCache is
used, not enabling eager reader based cache eviction
[2011-04-12 16:13:30,116][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
initialized
[2011-04-12 16:13:30,116][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
starting ...
[2011-04-12 16:13:30,225][INFO ][transport ] [Cody
Mushumanski gun Man aka: the hunter] bound_address
{inet[/0:0:0:0:0:0:0:0:9300]}, publish_address
{inet[/192.168.1.190:9300]}
[2011-04-12 16:13:33,256][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] new_master [Cody Mushumanski gun
Man aka: the hunter][FG6YrkLqTTqwWxVnA99Xlg][inet[/192.168.1.190:9300]],
reason: zen-disco-join (elected_as_master)
[2011-04-12 16:13:33,270][INFO ][discovery ] [Cody
Mushumanski gun Man aka: the hunter]
trendictionsearch/FG6YrkLqTTqwWxVnA99Xlg
[2011-04-12 16:13:33,349][INFO ][http ] [Cody
Mushumanski gun Man aka: the hunter] bound_address
{inet[/0:0:0:0:0:0:0:0:9200]}, publish_address
{inet[/192.168.1.190:9200]}
[2011-04-12 16:13:33,350][INFO ][node ] [Cody
Mushumanski gun Man aka: the hunter] {elasticsearch/0.15.2}[8729]:
started
[2011-04-12 16:13:34,986][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added {[Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]],}, reason:
zen-disco-receive(join from node[[Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]]])
[2011-04-12 16:13:35,964][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added
{[Zaladane][6pbf3UETTdiuiyHUIk6bwQ][inet[/192.168.1.194:9300]],},
reason: zen-disco-receive(join from
node[[Zaladane][6pbf3UETTdiuiyHUIk6bwQ][inet[/192.168.1.194:9300]]])
[2011-04-12 16:13:37,146][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added
{[Poison][OWoKy1PKRqKB9yTgfz_7Eg][inet[/192.168.1.191:9300]],},
reason: zen-disco-receive(join from
node[[Poison][OWoKy1PKRqKB9yTgfz_7Eg][inet[/192.168.1.191:9300]]])
[2011-04-12 16:13:38,116][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] added
{[Dagger][DN31iuVQRw2MKAD3U5Q1mg][inet[/192.168.1.195:9300]],},
reason: zen-disco-receive(join from
node[[Dagger][DN31iuVQRw2MKAD3U5Q1mg][inet[/192.168.1.195:9300]]])
[2011-04-12 16:13:38,700][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [_river] creating index, cause
[gateway], shards [1]/[2], mappings [dsearch_all, dummy_river]
[2011-04-12 16:13:38,870][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adb] creating
index, cause [gateway], shards [5]/[2], mappings [content]
[2011-04-12 16:13:39,041][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adc] creating
index, cause [gateway], shards [5]/[2], mappings [content]
[2011-04-12 16:13:39,178][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3add] creating
index, cause [gateway], shards [5]/[2], mappings [content]
[2011-04-12 16:13:39,294][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all] creating index,
cause [gateway], shards [5]/[2], mappings [dsearch_all]
[2011-04-12 16:13:39,303][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [_river] created and added to
cluster_state
[2011-04-12 16:13:39,311][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adb] created and
added to cluster_state
[2011-04-12 16:13:39,324][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3adc] created and
added to cluster_state
[2011-04-12 16:13:39,334][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all_3add] created and
added to cluster_state
[2011-04-12 16:13:39,345][INFO ][cluster.metadata ] [Cody
Mushumanski gun Man aka: the hunter] [dsearch_all] created and added
to cluster_state
[2011-04-12 16:13:40,600][INFO ][river.dummy ] [Cody
Mushumanski gun Man aka: the hunter] [dummy][dummy_river] create
[2011-04-12 16:13:40,600][INFO ][river.dummy ] [Cody
Mushumanski gun Man aka: the hunter] [dummy][dummy_river] start

[2011-04-12 16:14:40,632][WARN ][river ] [Cody
Mushumanski gun Man aka: the hunter] failed to create river
[dummy][dummy_river]
org.elasticsearch.action.UnavailableShardsException: [_river][0] [3]
shardIt, [1] active : Timeout waiting for [1m], request: index
{[_river][dummy_river][_status],
source[{"ok":true,"node":{"id":"FG6YrkLqTTqwWxVnA99Xlg","name":"Cody
Mushumanski gun Man aka: the
hunter","transport_address":"inet[/192.168.1.190:9300]"}}]}
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$3.onTimeout(TransportShardReplicationOperationAction.java:409)
at
org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:281)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
[2011-04-12 16:15:40,138][WARN ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] failed to reconnect to node
[Connors, Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]]
org.elasticsearch.transport.ConnectTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]] connect_timeout[30s]
at
org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:512)
at
org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:473)
at
org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:126)
at
org.elasticsearch.cluster.service.InternalClusterService$ReconnectToNodes.run(InternalClusterService.java:301)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
at
org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
at
org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
... 3 more
[2011-04-12 16:15:40,635][WARN ][river ] [Cody
Mushumanski gun Man aka: the hunter] failed to write failed status for
river creation
org.elasticsearch.action.UnavailableShardsException: [_river][0] [3]
shardIt, [1] active : Timeout waiting for [1m], request: index
{[_river][dummy_river][_status],
source[{"ok":true,"node":{"id":"FG6YrkLqTTqwWxVnA99Xlg","name":"Cody
Mushumanski gun Man aka: the
hunter","transport_address":"inet[/192.168.1.190:9300]"}}]}
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$3.onTimeout(TransportShardReplicationOperationAction.java:409)
at
org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:281)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
[2011-04-12 16:15:40,637][WARN ][transport ] [Cody
Mushumanski gun Man aka: the hunter] Received response for a request
that has timed out, sent [120034ms] ago, timed out [90034ms] ago,
action [/cluster/nodes/indices/shard/store/node], node
[[Zaladane][6pbf3UETTdiuiyHUIk6bwQ][inet[/192.168.1.194:9300]]], id
[234]
[2011-04-12 16:15:40,650][WARN ][transport ] [Cody
Mushumanski gun Man aka: the hunter] Received response for a request
that has timed out, sent [90011ms] ago, timed out [60011ms] ago,
action [/cluster/nodes/indices/shard/store/node], node
[[Dagger][DN31iuVQRw2MKAD3U5Q1mg][inet[/192.168.1.195:9300]]], id
[398]
[2011-04-12 16:15:40,659][DEBUG][action.search.type ] [Cody
Mushumanski gun Man aka: the hunter] [2] Failed to execute fetch phase
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][search/phase/fetch/id]
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:194)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:166)
at
org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:318)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetch(TransportSearchDfsQueryThenFetchAction.java:226)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.innerExecuteFetchPhase(TransportSearchDfsQueryThenFetchAction.java:187)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetchPhase(TransportSearchDfsQueryThenFetchAction.java:164)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$600(TransportSearchDfsQueryThenFetchAction.java:63)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$3.onResult(TransportSearchDfsQueryThenFetchAction.java:145)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$3.onResult(TransportSearchDfsQueryThenFetchAction.java:140)
at
org.elasticsearch.search.action.SearchServiceTransportAction$3.handleResponse(SearchServiceTransportAction.java:175)
at
org.elasticsearch.search.action.SearchServiceTransportAction$3.handleResponse(SearchServiceTransportAction.java:168)
at
org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:132)
at
org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:102)
at
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:754)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:302)
at
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:317)
at
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:299)
at
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:216)
at
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:540)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:274)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:261)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
at
org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.elasticsearch.transport.NodeNotConnectedException:
[Connors, Curtis][inet[/192.168.1.192:9300]] Node not connected
at
org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:566)
at
org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:424)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:179)
... 32 more
[2011-04-12 16:15:40,670][WARN ][search.action ] [Cody
Mushumanski gun Man aka: the hunter] Failed to send release search
context
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][search/freeContext]
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:194)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:166)
at
org.elasticsearch.search.action.SearchServiceTransportAction.sendFreeContext(SearchServiceTransportAction.java:95)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.releaseIrrelevantSearchContexts(TransportSearchTypeAction.java:319)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.finishHim(TransportSearchDfsQueryThenFetchAction.java:254)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.access$1300(TransportSearchDfsQueryThenFetchAction.java:63)
at
org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$6.onFailure(TransportSearchDfsQueryThenFetchAction.java:242)
at
org.elasticsearch.search.action.SearchServiceTransportAction$8.handleException(SearchServiceTransportAction.java:329)
at
org.elasticsearch.transport.TransportService$2.run(TransportService.java:197)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.elasticsearch.transport.NodeNotConnectedException:
[Connors, Curtis][inet[/192.168.1.192:9300]] Node not connected
at
org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:566)
at
org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:424)
at
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:179)
... 11 more
[2011-04-12 16:15:40,831][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,957][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,968][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,978][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,982][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,985][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,987][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,990][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,993][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:40,995][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:41,001][WARN ][gateway.local ] [Cody
Mushumanski gun Man aka: the hunter] failures when trying to list
started shards on nodes:
-> org.elasticsearch.action.FailedNodeException: Failed node
[T5cWIcpXTIehneRXTALiLQ];
org.elasticsearch.transport.SendRequestTransportException: [Connors,
Curtis][inet[/192.168.1.192:9300]][/gateway/local/started-shards/node];
org.elasticsearch.transport.NodeNotConnectedException: [Connors,
Curtis][inet[/192.168.1.192:9300]] Node not connected
[2011-04-12 16:15:41,004][INFO ][cluster.service ] [Cody
Mushumanski gun Man aka: the hunter] removed {[Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]],}, reason:
zen-disco-node_left([Connors,
Curtis][T5cWIcpXTIehneRXTALiLQ][inet[/192.168.1.192:9300]])


(system) #16