Disk space for the Lucene index?
Disk space for the _source data?
Disk space for logs and other metadata?
Does it count shards?
Does it count replicas?
Anything else?
I know this question has been asked multiple times, but I have not been
able to find a succinct breakdown of exactly what is involved in the
calculation. The actual equation would be even better!
On Thu, Jan 16, 2014 at 2:38 AM, Alexander Reelsen alr@spinscale.de wrote:
Hey,
From a quick peek at the source, the StoreStats are generated in
Store.stats(), which uses the Lucene index Directory to get its size;
in the end this calls file.length() for each file in that directory. So it
is the size used by a Lucene index, in bytes.
The indices stats API shows the data for total shards, for primaries only,
or for replicas, and the same per index - so you can decide which data is
important to count in your monitoring system.
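As a rough model of that behavior (a Python sketch, not the actual Java code in Store.stats()), summing file.length() over every file in the Lucene index directory looks like this:

```python
import os

def index_store_size(index_dir):
    """Sum the on-disk size of every file in a Lucene index
    directory, mirroring how Store.stats() derives its byte count
    by calling file.length() on each file."""
    total = 0
    for name in os.listdir(index_dir):
        path = os.path.join(index_dir, name)
        if os.path.isfile(path):
            total += os.path.getsize(path)
    return total
```

Because _source is stored as a regular field inside those Lucene files, it is automatically part of this sum.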
I did more digging. It turns out that in version 0.90.9 the _source data
is included in the calculation. In other words, the stats are the entire
disk space used by an index, including source data. And it is broken down by
indices, primaries, etc. as Alex said.
I did not test whether it takes source data compression into account, but
it appears that it does.
The _source is just a field in the index - that's the reason it is
included. What is not included is something like the translog, so the
entire disk space used by an index is not in there, IIRC.
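Putting the thread together, the primaries/total split from the indices stats API is what lets you separate primary from replica storage. The snippet below parses a hand-written slice of a stats response (the shape matches what the API returns under `_all`, but the byte counts are made-up sample values, not real measurements):

```python
# Hypothetical slice of an indices stats (store) response; the
# nesting mirrors the real API, the numbers are invented samples.
sample = {
    "_all": {
        "primaries": {"store": {"size_in_bytes": 1048576}},
        "total": {"store": {"size_in_bytes": 2097152}},
    }
}

def store_sizes(stats):
    """Extract primary-only and total (primaries + replicas) store
    sizes, in bytes, from an indices stats response."""
    primaries = stats["_all"]["primaries"]["store"]["size_in_bytes"]
    total = stats["_all"]["total"]["store"]["size_in_bytes"]
    replicas = total - primaries  # replicas make up the difference
    return primaries, total, replicas
```

With one replica per shard, you would expect the total to be roughly twice the primaries figure, as in the sample above.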