Minimum disk space needed to account for segment merges

What is the minimum storage space Elasticsearch nodes should have to ensure
smooth merging of segments, accounting for index optimize calls? From my
understanding, Lucene needs 3X the index size in disk space [1], especially
during forced merges. Our use case is not memory bound. Does that mean I
should plan capacity based on only 1/3 of total disk space?

[1] -
https://issues.apache.org/jira/browse/LUCENE-6386?focusedCommentId=14391162&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14391162
Thanks,
Ankit

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/056ae7cf-6d54-4bc7-bd15-9eb99173c347%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

It needs as much space as the segments it's merging, so if you have 2 x
10GB segments then you'd want at least 20GB free.
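
As a rough illustration of that rule of thumb, here is a minimal Python sketch. The function names are hypothetical and it only encodes the "free space >= total size of the segments being merged" estimate from this reply, not anything Elasticsearch or Lucene computes internally.

```python
def free_space_needed_gb(segment_sizes_gb):
    """Free space (GB) for one merge under the rule of thumb above:
    as much space as the segments being merged."""
    return sum(segment_sizes_gb)


def merge_fits(segment_sizes_gb, free_gb):
    """True if the estimated merge output fits in the free space."""
    return free_gb >= free_space_needed_gb(segment_sizes_gb)


print(free_space_needed_gb([10, 10]))      # 20
print(merge_fits([10, 10], free_gb=15))    # False
```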

Don't we need to consider space for compound file formats? In the worst-case
scenario, with CFS enabled, wouldn't a forceMerge to 1 segment need 40GB of
free space for 2 x 10GB segments?

Also, index.merge.policy.max_merge_size defaults to unbounded. Are there any
recommended values for it?

There is no disk-based circuit breaker turned on by default, and the index
just appears to go red when there is no free disk space. So I wanted to set
safeguards at my end to avoid this issue.

There are watermarks on disk use -
http://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-allocation.html#disk

I don't follow why merging two 10GB segments needs 40GB though.
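
For reference, a minimal Python sketch that builds a request body for the cluster settings API (PUT /_cluster/settings) with the two disk watermark settings. The helper name and the 80%/90% values are illustrative assumptions, not recommendations, and the watermarks only influence shard allocation.

```python
import json


def watermark_settings(low="80%", high="90%"):
    """Build a transient cluster-settings body for the disk watermarks.
    The default percentages here are illustrative placeholders."""
    return {
        "transient": {
            "cluster.routing.allocation.disk.watermark.low": low,
            "cluster.routing.allocation.disk.watermark.high": high,
        }
    }


# Print the JSON that would be sent as the body of PUT /_cluster/settings.
print(json.dumps(watermark_settings(), indent=2))
```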

Disk watermarks are of no use when all nodes in the cluster are running low
on disk and it is the existing shards that receive continuous writes. It
would be great if ES could reject such writes when disk space is low rather
than letting the index go red.

Regarding the need for 40GB of free space for a 2 x 10GB segment merge, the
following links will help:

1. http://lucene.apache.org/core/5_0_0/core/org/apache/lucene/index/IndexWriter.html#forceMerge(int)
2. https://issues.apache.org/jira/browse/LUCENE-6386?focusedCommentId=14391162&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14391162
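
To tie the different multipliers in this thread back to the original capacity question, here is a small Python sketch. It assumes a simplified model in which a forced merge needs k times the index size as extra free space, so an index of size S on a disk of size D must satisfy S + k*S <= D, i.e. S <= D / (1 + k). The function name is hypothetical, and which k actually applies is exactly what is being debated above.

```python
def max_index_fraction(free_multiplier):
    """Largest fraction of the disk the index may occupy if a forced merge
    needs `free_multiplier` times the index size as extra free space."""
    if free_multiplier < 0:
        raise ValueError("free_multiplier must be non-negative")
    return 1.0 / (1.0 + free_multiplier)


# k = 1: free space equal to the merged segments; k = 2 and k = 3: larger
# worst-case assumptions such as those discussed for forced merges.
for k in (1, 2, 3):
    print(k, round(max_index_fraction(k), 3))
# 1 0.5
# 2 0.333
# 3 0.25
```

Under this model, planning for only 1/3 of the disk corresponds to k = 2, that is, 40GB free for a 20GB index.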
