When I shut down the entire cluster, the river fails to start on the
next startup.
I am using the head version of the 0.16 branch and a local gateway.
The following warning is logged:
[2011-05-04 16:19:39,401][WARN ][river ]
[Moonstone] failed to get _meta from [testcluster]/[testriver]
org.elasticsearch.action.NoShardAvailableActionException: [_river][0]
No shard available for [testriver#_meta]
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.perform(TransportShardSingleOperationAction.java:193)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.performFirst(TransportShardSingleOperationAction.java:159)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.start(TransportShardSingleOperationAction.java:116)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:69)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:45)
at org.elasticsearch.action.support.BaseAction.execute(BaseAction.java:61)
at org.elasticsearch.client.node.NodeClient.get(NodeClient.java:152)
at org.elasticsearch.client.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:119)
at org.elasticsearch.client.action.support.BaseRequestBuilder.execute(BaseRequestBuilder.java:56)
at org.elasticsearch.river.RiversService$ApplyRivers.riverClusterChanged(RiversService.java:217)
at org.elasticsearch.river.cluster.RiverClusterService$1.run(RiverClusterService.java:126)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
It seems that the RiversService tries to get the river's _meta document
before the cluster state has been restored, and never retries afterwards.
If I manually kill the node that issues the warning, the river is
started correctly on another node.
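Just to illustrate what I mean, here is a rough sketch of the kind of retry I have in mind. This is not the actual RiversService code; the class, method names and scheduling details are made up. It only shows NoShardAvailableActionException being treated as "try again later" instead of as a permanent failure:

import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.NoShardAvailableActionException;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.client.Client;

public class RiverMetaFetcher {

    // Hypothetical sketch: keep retrying the get of the river's _meta document
    // until the _river shard has been recovered by the local gateway.
    public static void fetchRiverMeta(final Client client,
                                      final ScheduledExecutorService scheduler,
                                      final String riverName) {
        client.prepareGet("_river", riverName, "_meta").execute(new ActionListener<GetResponse>() {
            @Override
            public void onResponse(GetResponse response) {
                // _meta is available; the river could be started from it here
            }

            @Override
            public void onFailure(Throwable e) {
                if (e instanceof NoShardAvailableActionException) {
                    // the shard is simply not allocated yet after the restart: retry later
                    scheduler.schedule(new Runnable() {
                        public void run() {
                            fetchRiverMeta(client, scheduler, riverName);
                        }
                    }, 5, TimeUnit.SECONDS);
                } else {
                    // any other failure: log and give up, as it does today
                }
            }
        });
    }
}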
I will try it out as soon as you push the fix.
Something that might be of help: I am currently using 5 nodes with 2 replicas.
It does not happen every time, but nearly every time.
Usually it happens when the river is being started on one of the first
nodes (in the order in which the nodes are started).
I got some problems with rivers not starting correctly after a full
cluster shutdown. This time it is not a
NoShardAvailableActionException, but an IndexMissingException, and
the river will not be started. The river is only started correctly on
another node if I kill the node getting the exception. I think the
retry logic in RiversService.java could also retry the get when
an IndexMissingException occurs (a rough sketch of the check I mean
follows the log below).
Best,
Michel
[2011-06-15 18:19:41,298][WARN ][discovery.zen.ping.multicast] [Thing] received ping response with no matching id [1]
[2011-06-15 18:19:41,298][WARN ][discovery.zen.ping.multicast] [Thing] received ping response with no matching id [1]
[2011-06-15 18:19:44,440][WARN ][river ] [Thing] failed to get _meta from [trendiction]/[dsearch_en]
org.elasticsearch.indices.IndexMissingException: [_river] missing
at org.elasticsearch.cluster.metadata.MetaData.concreteIndex(MetaData.java:211)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:107)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:88)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:69)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:45)
at org.elasticsearch.action.support.BaseAction.execute(BaseAction.java:61)
at org.elasticsearch.client.node.NodeClient.get(NodeClient.java:152)
at org.elasticsearch.client.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:119)
at org.elasticsearch.client.action.support.BaseRequestBuilder.execute(BaseRequestBuilder.java:56)
at org.elasticsearch.river.RiversService$ApplyRivers.riverClusterChanged(RiversService.java:220)
at org.elasticsearch.river.cluster.RiverClusterService$1.run(RiverClusterService.java:126)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
[2011-06-15 18:19:45,173][WARN ][discovery ] [Thing] waited for 30s and no initial state was set by the discovery
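To make concrete what I mean by also retrying on IndexMissingException, a minimal sketch of the check could look like the code below. The helper name is hypothetical and this is not the actual RiversService code; the point is only that IndexMissingException is treated as transient in the same way as NoShardAvailableActionException:

import org.elasticsearch.action.NoShardAvailableActionException;
import org.elasticsearch.indices.IndexMissingException;

public class RiverMetaRetryCheck {

    // Hypothetical helper: decide whether a failed get of the river's _meta
    // document should be retried. Right after a full cluster restart the
    // _river index may not have been recovered yet, so both exceptions
    // below can be transient.
    public static boolean shouldRetryMetaGet(Throwable e) {
        return e instanceof NoShardAvailableActionException
                || e instanceof IndexMissingException;
    }
}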