Paramedic - what do the values in the index shard details mean?

timbunce · January 9, 2013, 12:48pm

We're using Paramedic (https://github.com/karmi/elasticsearch-paramedic)
and finding it helpful to monitor our ElasticSearch cluster.

At the bottom of the display it shows a summary line for each index
with a Show Details button. Clicking that reveals a series of boxes
for each shard (columns) on each node (rows).

In each box there's a status heading and two figures: a time and a size.
There's no indication what the figures relate to, other than relating to
a particular shard on a particular node, obviously.

I'm especially interested because we're having some performance issues
and these numbers may shed light on it. For reference we have 3 nodes and
our indices are about 5GB and get rolled over daily.

Most of the figures are very low, say 20ms and 58b, but the shards for
one row (node) show very high figures, say 52s and 1012mb. (I also once
saw high values in just one column, i.e., one shard across several nodes.)

It's possible it may be due to the particular way we're loading and
querying the data, but I can't be sure without knowing more about the
numbers.

I'd be grateful if someone could shed some light on them.

Tim.

--

drewr · January 9, 2013, 8:44pm

Tim Bunce wrote:

We're using Paramedic (https://github.com/karmi/elasticsearch-paramedic)
and finding it helpful to monitor our Elasticsearch cluster.

[...]

Most of the figures are very low, say 20ms and 58b, but the shards
for one row (node) show very high figures, say 52s and 1012mb. (I
also once saw high values in just one column, i.e., one shard
across several nodes.)

It's possible it may be due to the particular way we're loading and
querying the data, but I can't be sure without knowing more about
the numbers.

I'd be grateful if someone could shed some light on them.

They're related to recovery, which is ES's term for initializing
shards and making them ready for use. I believe the numbers are only
updated when the shard is first recovered. Mine do not seem to
update for ongoing replication.

The primary shards (blue) will probably never have large numbers
because they don't need to recover data (unless a replica has been
promoted to a primary). And you may see small numbers on replica shards
if they were created at index time.

From my usage, the numbers seem to be large when a replica was
created from a large-ish primary shard and had to be recovered with a
non-trivial amount of data. In your case, that 52s/1G was likely an
initialized shard on a remote node, which is effectively 19.69MBps.
That may or may not be acceptable depending on your network, but it's
likely normal operation.
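
The 19.69MBps figure is just size divided by recovery time, taking 1G as 1024 MB. A minimal sketch of that arithmetic follows; the function name is made up for illustration, and the second call uses the 1012mb value exactly as Paramedic displayed it:

```python
def recovery_rate_mb_per_s(size_mb, seconds):
    """Average recovery throughput: recovered size divided by elapsed time."""
    return size_mb / seconds


# 1G rounded up to 1024 MB, recovered in 52 s
print(f"{recovery_rate_mb_per_s(1024, 52):.2f} MB/s")  # 19.69 MB/s

# The size exactly as shown in the shard box (1012mb)
print(f"{recovery_rate_mb_per_s(1012, 52):.2f} MB/s")  # 19.46 MB/s
```

Either way the rate is roughly 20 MB/s, so the rounding does not change the conclusion.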

-Drew

--

Karel_Minarik_2 · January 10, 2013, 2:45pm

To add to Drew's earlier explanation:

Paramedic is just an interface to the "Index Status API". To see the
raw data, open http://localhost:9200/_status?recovery=true
The first value is indeed the time spent recovering the shard, and the
second value is the size of the index in MB/GB (see the raw data).
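
If you would rather read the raw data in a script than in the browser, something like the sketch below might do. Only the URL comes from this thread. The JSON layout (indices, then shards, then a list of shard copies with a "peer_recovery" or "gateway_recovery" object holding "time_in_millis" and "index.size_in_bytes") is an assumption about the response shape, so check it against what your own cluster returns.

```python
import json
from urllib.request import urlopen

STATUS_URL = "http://localhost:9200/_status?recovery=true"


def fetch_status(url=STATUS_URL):
    """Fetch and parse the raw status JSON (needs a reachable node)."""
    with urlopen(url) as response:
        return json.load(response)


def recovery_rows(status):
    """Yield (index, shard, node, time_ms, size_bytes) for every shard copy.

    ASSUMPTION: the field names used here are a guess at the response
    layout; missing fields come back as None instead of raising.
    """
    for index_name, index in status.get("indices", {}).items():
        for shard_id, copies in index.get("shards", {}).items():
            for copy in copies:
                recovery = (copy.get("peer_recovery")
                            or copy.get("gateway_recovery")
                            or {})
                yield (
                    index_name,
                    shard_id,
                    copy.get("routing", {}).get("node"),
                    recovery.get("time_in_millis"),
                    recovery.get("index", {}).get("size_in_bytes"),
                )
```

Calling `recovery_rows(fetch_status())` and printing each tuple should give one line per box in the Paramedic shard details view, if the assumed layout holds.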

These numbers indeed stay the same once the shard has been successfully
loaded/recovered. If one of the shards of the same index is too big
compared to the other shards (of the same index), you may be using the
routing feature in a sub-optimal way, creating a "hot shard" with too
much data.
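
As a rough way to spot such a "hot shard", you could compare each shard's size against the median size for the same index. This is only a sketch: the function name, the factor of 3 and the sample sizes are arbitrary choices for illustration, not anything Paramedic or Elasticsearch defines.

```python
from statistics import median


def hot_shards(sizes_by_shard, factor=3.0):
    """Return ids of shards whose size exceeds factor x the median size.

    sizes_by_shard maps shard id -> size (any unit, as long as it is
    the same unit for every shard of the index).
    """
    if not sizes_by_shard:
        return []
    typical = median(sizes_by_shard.values())
    return sorted(shard for shard, size in sizes_by_shard.items()
                  if size > factor * typical)


# Hypothetical sizes in MB for a five-shard index
sizes_mb = {0: 980, 1: 1012, 2: 1005, 3: 4800, 4: 990}
print(hot_shards(sizes_mb))  # [3]
```

The median is used instead of the mean so that one oversized shard does not drag the baseline up with it.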

Karel

--

timbunce · January 10, 2013, 7:27pm

On Thu, Jan 10, 2013 at 06:45:50AM -0800, Karel Minařík wrote:

To add to Drew's earlier explanation:

Paramedic is just an interface to the "Index Status API". To see the raw
data, open http://localhost:9200/_status?recovery=true

The first value is indeed the time spent recovering the shard, and the second value is the size of the
index in MB/GB (see the raw data).
These numbers indeed stay the same once the shard has been successfully loaded/recovered. If one of the
shards of the same index is too big compared to the other shards (of the same index), you may be using
the routing feature in a sub-optimal way, creating a "hot shard" with too much data.

Thank you both for the detailed replies.

Tim.
