At the bottom of the display it shows a summary line for each index
with a Show Details button. Clicking that reveals a grid of boxes,
one for each shard (columns) on each node (rows).
In each box there's a status heading and two figures: a time and a size.
There's no indication of what the figures represent, other than that
they relate to a particular shard on a particular node, obviously.
I'm especially interested because we're having some performance issues
and these numbers may shed light on them. For reference, we have 3 nodes
and our indices are about 5GB and get rolled over daily.
Most of the figures are very low, say 20ms and 58b, but the shards for
one row (node) show very high figures, say 52s and 1012mb. (I also once
saw high values in just one column, i.e., one shard across several nodes.)
It's possible it may be due to the particular way we're loading and
querying the data, but I can't be sure without knowing more about the
numbers.
I'd be grateful if someone could shed some light on them.
They're related to recovery, which is ES's term for initializing
shards and making them ready for use. I believe the numbers are only
updated when the shard is first recovered. Mine do not seem to
update for ongoing replication.
The primary shards (blue) will probably never have large numbers
because they don't need to recover data (unless a replica has been
promoted to a primary). And you may see small numbers on replica
shards if they were created at index time.
From my usage, the numbers seem to be large when a replica was
created from a large-ish primary shard and had to be recovered with a
non-trivial amount of data. In your case, that 52s/1G was likely an
initialized shard on a remote node, which is effectively 19.69MBps.
That may or may not be acceptable depending on your network, but it's
likely normal operation.
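Drew's throughput figure is easy to sanity-check with a bit of
arithmetic; a quick sketch, assuming the reported "1012mb" is close
enough to a 1GB (1024MB) shard:

```python
# Sanity-check of the recovery throughput quoted above:
# a ~1GB replica (1024 MB assumed; the UI reported 1012mb)
# recovered in 52 seconds.
size_mb = 1024
time_s = 52

throughput = size_mb / time_s
print(f"{throughput:.2f} MB/s")  # -> 19.69 MB/s
```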
The first value is indeed the time spent recovering the shard, and the
second value is the size (in MB/GB) of the index (see the raw data).
These numbers indeed stay the same once the shard has been successfully
loaded/recovered. If one of the shards of an index is too big compared
to the other shards of the same index, you may be using the routing
feature in a sub-optimal way, creating a "hot shard" with too much
data.
Karel
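Karel's hot-shard point can be sketched in a few lines. Conceptually,
Elasticsearch routes a document to a primary shard by hashing its
routing value (the document id by default) modulo the number of primary
shards; the hash function below is a stand-in for illustration, not the
one ES actually uses:

```python
from collections import Counter

NUM_PRIMARY_SHARDS = 5

def shard_for(routing: str) -> int:
    # Stand-in hash; a real cluster uses its own hash function.
    return sum(ord(c) for c in routing) % NUM_PRIMARY_SHARDS

# Default routing: the document id varies, so docs spread across shards.
spread = Counter(shard_for(f"doc-{i}") for i in range(1000))

# One custom routing value for every document: a single "hot shard".
hot = Counter(shard_for("customer-42") for _ in range(1000))

print(dict(spread))  # documents land on all 5 shards
print(dict(hot))     # all 1000 documents land on one shard
```

The second counter is the failure mode Karel describes: if most
documents share one routing value, one shard grows far larger than its
siblings, and its replicas take correspondingly longer to recover.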
On Wednesday, January 9, 2013 9:44:23 PM UTC+1, Drew Raines wrote:
[...]