Interesting - I thought I'd narrowed it down to Marvel. I had big
imbalances with Marvel running, but now it all seems flat (although to be
fair, disk usage has dropped to around 5GB used in a 32GB partition, so
there's a large amount of free space).
Same as you though - I could do nothing to rebalance the usage. We've
built the cluster so that nodes can be rebuilt and rejoin the cluster if
absolutely necessary. Even doing that didn't affect the balance.
Will keep an eye on those issues. Looks like there's still a chance of
disk imbalance, though, if all disks are well below the watermark. At
least the issue looks to be addressed once disk use pops above the high
watermark.
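For anyone else following along, I believe the relevant knobs are the disk
allocation watermarks, which can be adjusted through the cluster settings
API - the values below are just the defaults as I understand them:

  curl -XPUT 'localhost:9200/_cluster/settings' -d '{
    "transient": {
      "cluster.routing.allocation.disk.watermark.low": "85%",
      "cluster.routing.allocation.disk.watermark.high": "90%"
    }
  }'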
Cheers
D
On Tuesday, 11 November 2014 13:26:01 UTC, Michael Hart wrote:
I think it's related to this:
"Reroute shards automatically when high disk watermark is exceeded"
(elastic/elasticsearch pull request #8270), which I believe was released
with 1.4.
We see the same thing, with hot spots on some nodes. You can poke the
cluster into rebalancing itself with "curl -XPOST
localhost:9200/_cluster/reroute", which is what #8270 fixes permanently.
That doesn't always sort it out, though, and this issue ("Updates causing
hotspots in cluster when multiple primary shards for an index exist on a
single node", elastic/elasticsearch issue #8149) is our main problem.
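For reference, the manual poke plus a quick check of how disk is spread
across the nodes looks roughly like this for us (host and port will
obviously vary):

  curl -XPOST 'localhost:9200/_cluster/reroute'
  curl 'localhost:9200/_cat/allocation?v'

The _cat/allocation output lists shard count and disk used/free per node,
which is how we spot the hot spots.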
AFAIK it's not just Marvel - any index can get into this situation.
Right now I have a few nodes with 1TB of free disk and others with 400GB,
and Marvel is in another cluster entirely.
cheers
mike
On Tuesday, November 11, 2014 4:15:33 AM UTC-5, Duncan Innes wrote:
I now know that Marvel creates a lot of data per day of monitoring - in
our case around 1GB.
What I'm just starting to get my head around is the imbalance of disk
usage that this caused on my 5 node cluster.
I've now removed Marvel and deleted the indexes for now (great tool, but
I don't have the disk space to spare on this proof of concept) and my disk
usage for the 12 months of rsyslog data has equalised across all the nodes
in my cluster. When the Marvel data was sitting there, not only was I
using far too much disk space, but I was also seeing significant
differences between nodes. At least one node would be using nearly all of
the 32GB, while other nodes would sit at half that or even less. Is there
something intrinsically different about Marvel's indexes that makes them
prone to such wild differences?
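(In case it helps anyone doing the same clean-up, it was roughly a case of
the following - assuming the default .marvel-* index naming, which is what
mine were called:

  curl 'localhost:9200/_cat/indices/.marvel-*?v'
  curl -XDELETE 'localhost:9200/.marvel-*'

The first command just shows how much space the Marvel indexes are using
before you delete them.)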
Thanks
Duncan