Wait for relocating shards

The ClusterHealthRequest allows one to wait for a specified number of
relocating shards. Would it be possible to extend the class to wait for all
relocating shards? Currently, I need to make one request to get the number
of relocating shards and then another request to wait for those shards to
finish relocating. Eliminating the need for the first request would be
optimal.

My use case is to move an alias and perform other administrative tasks once
a full index has been rolled out and the number of replicas has been
increased from 0 (set to zero to speed up a full re-index).

Ivan

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

My questions always seem to go unanswered. I guess I need to ask why my
term query doesn't work instead of asking about the more esoteric inner
workings. :-)

In the end, waiting for relocating shards might not be the answer, since
newly added replicas are in the recovering state, not relocating. Does
anybody have a workflow that waits until the cluster is green after
increasing the number of replicas following a bulk index? Waiting for green
does not work, because the cluster stays green for a bit before finally
turning yellow. My current process is to wait for yellow and then wait for
green, which is kludgy and does not always work.

Cheers,
Ivan

Before bulk indexing, I wait for cluster health yellow. Yellow means the
cluster is available, but not necessarily with enough replica shards to
survive a node failure.

Waiting for green should be the exception. It means you wait for enough
nodes so that even a node failure will not degrade the cluster's
availability, i.e. enough replica shards are there. I think the presence
of enough replica shards should not affect bulk indexing. In fact, I
create indices with replica 0 for bulk indexing and add the replica
level later.

Jörg

Once you send the update settings with the increased replica count, and wait for it to return, the next call for health will take the increased replica count into account. Or is that not the order you call the APIs?
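
The ordering described above can be sketched as follows. This is only an illustration, not the Elasticsearch client API: `ClusterOps`, `setReplicas`, `waitForGreen`, and `ReplicaRollout` are hypothetical names, and the sketch assumes the settings call hands back a future that completes once the cluster has acknowledged the new replica count.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for the two admin calls discussed in this thread;
// these are NOT Elasticsearch client signatures.
interface ClusterOps {
    // Assumed to complete once the cluster has acknowledged the new replica count.
    CompletableFuture<Void> setReplicas(String index, int replicas);

    // Assumed to block until the cluster reports green, then return the status.
    String waitForGreen();
}

final class ReplicaRollout {
    private ReplicaRollout() {}

    static String raiseReplicasThenWait(ClusterOps ops, String index, int replicas) {
        // Block on the settings update first. Without this join() the call is
        // fire-and-forget, so the health check could run against the old
        // replica count and report green too early.
        ops.setReplicas(index, replicas).join();
        return ops.waitForGreen();
    }
}
```

The point of the sketch is the `join()` before the health call: the wait for green is only meaningful once the settings update has returned.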

Jörg, the issue is not bulk indexing, but afterwards.

After bulk indexing, I want to wait for the cluster to return to green
before executing some administrative tasks such as moving aliases.
Increasing the replica count does not send the cluster to a yellow state
immediately, so waiting for green is not possible: the cluster will still
be green momentarily.

The current solution is to wait for yellow and then to wait for green.
Any suggestions for something less kludgy?

Cheers,

Ivan

On Fri, Feb 1, 2013 at 3:50 AM, kimchy@gmail.com wrote:

Once you send the update settings with the increased replica count, and
wait for it to return, the next call for health will take the increased
replica count into account. Or is that not the order you call the APIs?

Ivan,

I understand, but have you looked at
org.elasticsearch.action.admin.cluster.health.ClusterHealthRequest/Response?

You can obtain fine-grained control over everything - waiting for nodes
and shards, and whether shards are relocating or initializing. In
ClusterHealthResponse, ES will provide you everything about the current
health state. There are several methods in ClusterHealthResponse:

activeShards(), relocatingShards(), activePrimaryShards(),
initializingShards(), unassignedShards(), numberOfNodes(),
numberOfDataNodes()

that allow you to enter a loop to wait for the exact state of the
cluster your application requires, or an exact state to let your
application abort.
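
Such a loop might look like the sketch below. It is deliberately independent of the Elasticsearch classes: `HealthSnapshot` and `HealthPoller` are hypothetical names, the snapshot only carries three of the counters listed above, and the `Supplier` stands in for whatever call fetches a fresh health response. The attempt limit and sleep interval are illustrative parameters, not recommended values.

```java
import java.util.function.Supplier;

// Hypothetical snapshot of three of the counters named above; in real code the
// values would be copied from a cluster health response.
final class HealthSnapshot {
    final int relocatingShards;
    final int initializingShards;
    final int unassignedShards;

    HealthSnapshot(int relocatingShards, int initializingShards, int unassignedShards) {
        this.relocatingShards = relocatingShards;
        this.initializingShards = initializingShards;
        this.unassignedShards = unassignedShards;
    }

    // True when no shard is moving, starting up, or waiting for a node.
    boolean settled() {
        return relocatingShards == 0 && initializingShards == 0 && unassignedShards == 0;
    }
}

final class HealthPoller {
    private HealthPoller() {}

    // Polls the source up to maxAttempts times; returns true as soon as a
    // snapshot is settled, false if the attempts run out or the thread is
    // interrupted while sleeping.
    static boolean waitUntilSettled(Supplier<HealthSnapshot> source, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (source.get().settled()) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

Returning `false` instead of looping forever leaves the decision to continue or abort with the caller.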

Note that "yellow" and "green" are just useful traffic light mnemonics for
the reliability level of the cluster - I wouldn't call them kludgy. They
are of great assistance so nobody needs to write programs that poll the
cluster over and over again just to count nodes or shards for computing
the current cluster reliability (and maybe doing it wrong).

In my use case, I enable refresh and add a replica level after bulk
indexing, and I don't wait for cluster health before search continues.
And yes, it works smoothly. I wait for green only before bulk indexing
starts, not afterwards, when it completes.

Best regards,

Jörg

On Mon, Feb 4, 2013 at 11:09 AM, Jörg Prante joergprante@gmail.com wrote:

Ivan,

I understand, but have you looked at
org.elasticsearch.action.admin.cluster.health.ClusterHealthRequest/Response?

ClusterHealthRequest was the second word in my post. :-)

You can obtain fine grained control over everything - waiting for nodes,
shards, if they relocate, initialize. In ClusterHealthResponse, ES will
provide you everything about the current health state. There are several
methods in ClusterHealthResponse

activeShards(), relocatingShards(), activePrimaryShards(),
initializingShards(), unassignedShards(), numberOfNodes(),
numberOfDataNodes()

Part of the problem is that newly added shards are set to RECOVERING,
and there is no wait for recovering.

Note, "yellow" and "green" are just useful traffic light mnemonics for the
reliability level of the cluster - I won't call them kludgy.

I meant to say that my process of waiting first for yellow and then green
is kludgy. Querying the cluster state for the total number of shards and
then waiting for those shards to be active will also require two calls (in
addition to the call to increase replicas).
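
The shard-count approach comes down to simple arithmetic, sketched below under the assumption that the counts refer to a single index: every primary has the configured number of replicas, so the expected number of active shards is primaries x (1 + replicas). `ShardMath` and its method names are hypothetical helpers, not part of any Elasticsearch API.

```java
// Hypothetical helper; the inputs would come from the index settings and a
// health response scoped to the same index.
final class ShardMath {
    private ShardMath() {}

    // Each primary shard plus its replicas counts as one active shard apiece.
    static int expectedActiveShards(int primaryShards, int replicasPerPrimary) {
        if (primaryShards < 0 || replicasPerPrimary < 0) {
            throw new IllegalArgumentException("shard counts must be non-negative");
        }
        return primaryShards * (1 + replicasPerPrimary);
    }

    // True once every primary and every replica is reported active.
    static boolean fullyReplicated(int activeShards, int primaryShards, int replicasPerPrimary) {
        return activeShards >= expectedActiveShards(primaryShards, replicasPerPrimary);
    }
}
```

For example, an index with 5 primaries going from 0 to 1 replica is fully replicated once 10 shards are active; with only the 5 primaries active it is not.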

Due to an issue with my code (it was executing a non-blocking call), searches
were being executed on the index before the shards were completely
replicated. There were no issues with timeouts, but I did not measure how
much of a performance impact there was, if any.

Cheers,

Ivan

Yes, unfortunately, checking an index for recovering shards is not part
of the cluster health API, but it's in the indices part of the admin
client. See IndicesStatusRequestBuilder with setRecovery(true). The
reason is that cluster health traverses the routing nodes and routing
tables, whereas the indices status checks the gateway service.

Jörg
