Random ordering of Shards in _segments API

I've been working on a simple site plugin to watch segments and merges in
real time, but I'm running into some trouble. When calling
//_segments/ to retrieve the segments in an index, it appears that
ES randomly orders the results. There is also no indication of a shard ID
like the /_cluster/state/ API returns. This is making it very difficult to
graph the segments, because there is no discernible difference between
shards other than the node and primary status.

E.g. on one request:
"indices": {
  "test": {
    "shards": {
      "0": [
        {
          "routing": {
            "state": "STARTED",
            "primary": false,
            "node": "J47gdOIwQMq2GTmzzmzJBA"
          },
          [...]

And on a subsequent request:
"indices": {
  "test": {
    "shards": {
      "0": [
        {
          "routing": {
            "state": "STARTED",
            "primary": true,
            "node": "kH152vsLTL-y20mcLFs9GQ"
          },
          [...]

As you can see, the first position (shard[0]) lists a different node on each
request, but there is no way to tell which shard it belongs to on the local
node. Is there a better way to query this data? I suppose I could
concatenate all the shards on a node into a single graph... but it would be
much more informative if I could keep the various primaries and replicas
separated.

Thanks!
-Zach

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

"0" is the shard id, and the array under it contains one entry for each
node where a copy of this shard is allocated. Yes, the array doesn't have a
fixed order, but each entry has a node id (J47gdOIwQMq2GTmzzmzJBA
and kH152vsLTL-y20mcLFs9GQ). You can find the addresses and names of these
nodes by running

curl "localhost:9200/_nodes?pretty=true"

Since a shard cannot be allocated twice on the same node, the combination
"test":"0":"J47gdOIwQMq2GTmzzmzJBA" uniquely identifies the shard in the
first example as shard "0" of the index "test" allocated on the node with
id "J47gdOIwQMq2GTmzzmzJBA". From the flag "primary": false we can conclude
that it is currently a replica shard.
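
For graphing, one way to apply this is to key every shard copy by
(index, shard id, node id) and sort the keys, so the order of the array in
the response no longer matters. The following is only a rough Python sketch
under assumptions: the function names key_shard_copies and stable_order are
made up for illustration, and the two sample responses are hand-built from
the snippets in the question, on the assumption that the elided part of each
array holds the other copy of shard "0".

```python
from typing import Any, Dict, List, Tuple

# (index name, shard id, node id); unique because a shard cannot be
# allocated twice on the same node.
ShardKey = Tuple[str, int, str]


def key_shard_copies(response: Dict[str, Any]) -> Dict[ShardKey, Dict[str, Any]]:
    """Index every shard copy in a _segments-style response by its key."""
    keyed: Dict[ShardKey, Dict[str, Any]] = {}
    for index_name, index_data in response.get("indices", {}).items():
        for shard_id, copies in index_data.get("shards", {}).items():
            for shard_copy in copies:
                node_id = shard_copy["routing"]["node"]
                keyed[(index_name, int(shard_id), node_id)] = shard_copy
    return keyed


def stable_order(keyed: Dict[ShardKey, Dict[str, Any]]) -> List[ShardKey]:
    """Return the keys in an order that does not depend on the response."""
    return sorted(keyed)


# Hypothetical sample data: the same two copies, listed in a different order.
replica = {"routing": {"state": "STARTED", "primary": False,
                       "node": "J47gdOIwQMq2GTmzzmzJBA"}}
primary = {"routing": {"state": "STARTED", "primary": True,
                       "node": "kH152vsLTL-y20mcLFs9GQ"}}
first_request = {"indices": {"test": {"shards": {"0": [replica, primary]}}}}
second_request = {"indices": {"test": {"shards": {"0": [primary, replica]}}}}

for key in stable_order(key_shard_copies(second_request)):
    print(key)
```

Both requests produce the same sorted key list, so each graph can be tied
to one key. The "primary" flag is left out of the key on purpose, since it
describes the copy's current role rather than its identity.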

On Friday, February 1, 2013 12:59:11 PM UTC-5, Zachary Tong wrote:


Gah, you're correct, thanks for laying it out. I saw the random ordering
and assumed that meant none of the shards could be uniquely identified.
And since APIs like the cluster state explicitly define the shard ID I
convinced myself something was wrong. Oops =)

Thanks again!
-Zach

On Friday, February 1, 2013 9:24:57 PM UTC-5, Igor Motov wrote:
