Hello,
I've added a new server to an ES cluster, and it was recognized OK. I have 2
indexes: one is mostly read-only, the other takes heavier writes. The first
one recovered completely from the master and all its shards are green. The
second index has been trying to recover for almost 6 hours with no progress
at all. I see that it puts 2 shards into Recovering mode and transfers a
little data (around 1 MB), then puts them back into Unassigned mode and tries
another 2 shards. It repeats this pattern forever.
What could it be? How can I solve it?
The whole weekend, the same problem: one index fails to replicate from the
master, the other index is OK. I've restarted the whole ES cluster; it didn't
help. I'd appreciate any advice. I have no clue what's going on; all the logs
are clean.
Thanks in advance,
Eugene
On Friday, July 19, 2013 7:56:24 PM UTC-4, Eugene Strokin wrote:
Hello,
I've added a new server to an ES cluster, and it was recognized OK. I have 2
indexes: one is mostly read-only, the other takes heavier writes. The first
one recovered completely from the master and all its shards are green. The
second index has been trying to recover for almost 6 hours with no progress
at all. I see that it puts 2 shards into Recovering mode and transfers a
little data (around 1 MB), then puts them back into Unassigned mode and tries
another 2 shards. It repeats this pattern forever.
What could it be? How can I solve it?
Maybe this will help someone in the future. I had to stop my applications
altogether, causing almost an hour of production downtime. After that ES was
able to replicate. I hope there is a better solution; if someone knows one,
please share.
On Sunday, July 21, 2013 11:18:50 AM UTC-4, Eugene Strokin wrote:
The whole weekend, the same problem: one index fails to replicate from the
master, the other index is OK. I've restarted the whole ES cluster; it didn't
help. I'd appreciate any advice. I have no clue what's going on; all the logs
are clean.
Thanks in advance,
Eugene
Hello Boaz,
that is what bothered me the most: all the logs were clean. I set the DEBUG
level on everything I could think of, and still saw nothing suspicious in the
logs.
They showed one index being restored shard by shard, and that was it.
While the second index was stuck in the loop, I didn't see any new log
records at all.
Eugene
On Tuesday, July 23, 2013 9:33:59 AM UTC-4, Boaz Leskes wrote:
Hi Eugene,
Google marked my reply as spam, so I guess you didn't get it... Did you
see anything in the logs about this?
Cheers,
Boaz
On Tue, Jul 23, 2013 at 3:11 PM, Eugene Strokin <eug...@strokin.info> wrote:
Maybe this will help someone in the future. I had to stop my applications
altogether, causing almost an hour of production downtime. After that ES was
able to replicate. I hope there is a better solution; if someone knows one,
please share.
Not the ideal solution, but whenever I get stuck recovering shards, I
simply set my replicas to 0 and let the cluster get back to green. From
there, I raise the replica count again. While replicas are at 0, if a node
goes down you will not have all the shards for an index, but it gets me
out of the first issue.
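For reference, the replica toggle described above can be sketched roughly as follows. This is only an illustration, not something posted in the thread: it assumes the index settings endpoint (`PUT /<index>/_settings` with `index.number_of_replicas`) is reachable over HTTP, and the host, the index name `my_index`, and both function names are placeholders made up for the example.

```python
import json
import urllib.request


def replica_settings_request(base_url: str, index: str, replicas: int) -> urllib.request.Request:
    """Build (but do not send) a PUT that changes number_of_replicas for one index."""
    if replicas < 0:
        raise ValueError("replicas must be >= 0")
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def set_replicas(base_url: str, index: str, replicas: int) -> dict:
    """Send the request to a running node and return the decoded JSON reply."""
    req = replica_settings_request(base_url, index, replicas)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Against a live cluster one would call something like `set_replicas("http://localhost:9200", "my_index", 0)`, wait for the cluster to report green, and then call it again with the original replica count. As noted above, the index has no redundancy while replicas are at 0.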
Thanks Ivan,
I'm trying to add another node to the cluster, and got stuck with the same
problem again.
I'm trying your method; it is better than shutting down the whole system and
waiting for replication to finish.
I've set replicas to 0, and my cluster reports green now. But 2 of my 5
shards are now constantly in the RELOCATING state, even though the replica
count is 0.
Have you seen this as well? Should I just wait? Is there anything else that
could be done here?
Thanks for your help,
Eugene
On Tuesday, July 23, 2013 6:45:16 PM UTC-4, Ivan Brusic wrote:
Not the ideal solution, but whenever I get stuck recovering shards, I
simply set my replicas to 0 and let the cluster get back to green. From
there, I raise the replica count again. While replicas are at 0, if a node
goes down you will not have all the shards for an index, but it gets me
out of the first issue.
--
Ivan
OK, I've finally got my cluster into a healthy state. I still don't know
what the problem was, but here is what I did:
Initial problem: added a new node to the cluster, and replication was not
happening. Shards were constantly in the RECOVERY state.
Set the number of replicas to 0 -> 2 shards relocated to the new
node, and 3 shards stayed on the existing node.
Set the number of replicas to 1 -> the cluster tried to replicate, but
the same pattern repeated: 2 shards went from unassigned to recovery
state, then back to unassigned.
Added a 3rd node to the cluster and set the number of replicas to 2 -> some
shards got replicated, some were still looping between recovery and
unassigned.
Luckily I had at least 1 replica of each shard on a different node by then,
although some shards were still looping back to unassigned, so I shut down
the 1st node, the master. A new master was elected and all shards got
replicated. I then set the number of replicas to 1, and the cluster is green
now.
I'm guessing I had some problem with the master that even a restart didn't
fix. But once a new master was elected, the situation normalized.
Thanks for your help,
Eugene
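Since the fix above came down to shutting down the node that was master, it may help to confirm which node that is and whether any shards are still pending before doing so. The sketch below is an illustration only: it works on already-parsed JSON from `_cluster/state` and `_cluster/health`, the field names (`master_node`, `nodes`, `unassigned_shards`, `initializing_shards`) are my understanding of those responses, and both helper names are invented for the example.

```python
def master_node_name(cluster_state: dict) -> str:
    """Return the name of the elected master from a parsed _cluster/state reply."""
    master_id = cluster_state["master_node"]  # node id of the current master
    return cluster_state["nodes"][master_id]["name"]


def pending_shards(cluster_health: dict) -> int:
    """Count shards that are not started yet, from a parsed _cluster/health reply."""
    return (
        cluster_health["unassigned_shards"]
        + cluster_health["initializing_shards"]
    )
```

A pending count that never reaches zero while shards cycle between recovery and unassigned would match the looping pattern described in this thread.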
That sucks. This sounds like a Java backward-compatibility issue we ran
into a while ago. Can you post the exact version numbers ES runs with on
all nodes? The easiest way is to gist the output of http://localhost:9200/_nodes?all which might reveal more interesting
information.
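A quick way to spot a version mismatch in that output is to collect the distinct ES and JVM versions across nodes. The following is only a sketch: it takes the already-parsed JSON of the nodes info response and assumes each node entry carries a `version` field and a `jvm` object with its own `version`. That layout and both function names are assumptions for illustration.

```python
def distinct_versions(nodes_info: dict) -> dict:
    """Collect the distinct ES and JVM versions reported across all nodes."""
    es_versions = set()
    jvm_versions = set()
    for node in nodes_info.get("nodes", {}).values():
        es_versions.add(node.get("version", "unknown"))
        jvm_versions.add(node.get("jvm", {}).get("version", "unknown"))
    return {"es": sorted(es_versions), "jvm": sorted(jvm_versions)}


def versions_match(nodes_info: dict) -> bool:
    """True when every node reports the same ES version and the same JVM version."""
    found = distinct_versions(nodes_info)
    return len(found["es"]) == 1 and len(found["jvm"]) == 1
```

If `versions_match` returns False, the lists from `distinct_versions` show which versions are in play. Clients are not included in this output, so their versions would need to be checked separately.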
Is there an issue open for this already on GitHub? I'm having the same
issue in our system and we are getting around it by doing what Ivan posted.
In our system (two nodes in the cluster, each both master and data, and many
clients connecting with TransportClient) the JVMs are exactly the same version:
version: "1.7.0_25"
vm_name: "Java HotSpot(TM) 64-Bit Server VM"
vm_version: "23.25-b01"
vm_vendor: "Oracle Corporation"
and the ES versions are also exactly the same: 0.90.5.
Thanks!
On Monday, August 12, 2013 9:31:34 AM UTC-7, Boaz Leskes wrote:
Hi Eugene,
That sucks. This sounds like a Java backward-compatibility issue we ran
into a while ago. Can you post the exact version numbers ES runs with on
all nodes? The easiest way is to gist the output of http://localhost:9200/_nodes?all which might reveal more interesting
information.
Cheers,
Boaz