Changing Merge Policy And Optimization

Hi All,
I indexed some data in Elasticsearch and have an index of around 18 GB,
with 10 shards, 0 replicas, and the default "tiered" merge policy.
I issued an optimize command, and the number of segments came down to 22-30
for each shard. It does not seem to go down any further, irrespective of
whether I specify max_num_segments in the optimize call or not.

I am thinking of changing the merge policy to log_doc (after closing the
index), but I have not been able to find proper documentation.
My main aim is to improve search performance by bringing down the
number of segments for each shard, which I hope to achieve by changing
the merge policy and then issuing an optimize call.

Also, does the max_num_segments argument sent with the optimize call
set the maximum number of segments per shard or per index?

I tried changing the settings with "index.merge_policy":"log_doc" and
"index.merge.policy":"log_doc",

but it accepts both parameters and does not indicate which one is valid
or in effect, even if I specify a wrong value, e.g. "loc_doc_123".
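
Since the update call reportedly accepts any key without complaint, one way to see what was actually stored is to read the index settings back. This is a minimal sketch, not a confirmed fix: the host "server:9200" and index "myindex" are placeholders, and the helper names are made up for illustration. As a further assumption to verify against the documentation for your version, the merge policy key in releases of this era may be "index.merge.policy.type" rather than either key tried above.

```shell
#!/bin/sh
# Hypothetical host and index name; adjust for your cluster.
ES_HOST="${ES_HOST:-http://server:9200}"
INDEX="${INDEX:-myindex}"

# Build the URL of the index settings endpoint.
settings_url() {
  printf '%s/%s/_settings?pretty=true' "$1" "$2"
}

# Print the command by default; set RUN=1 to actually send the request.
show_settings() {
  url=$(settings_url "$ES_HOST" "$INDEX")
  if [ "${RUN:-0}" = "1" ]; then
    curl -s -XGET "$url"
  else
    echo "curl -s -XGET \"$url\""
  fi
}

show_settings
```

Seeing a key echoed back only shows that it was stored, not that the merge policy honors it.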

Could somebody please help me resolve the issue ?

Thanks
Tarang Dawer

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Lots of questions. First, performance is a big topic. Why have you decided
that merge optimization is the right way to achieve your goals? You may be
disappointed.

Segment optimization does achieve a measurable benefit in our production
setup, but it's behind a lot of other things that we tried: number of CPUs,
memory size, VM memory, query structure, etc.

That said, the segment optimize works on shards: each shard is a Lucene
index, and the optimize is passed on to Lucene. You can often read the
Lucene documentation to better understand the tradeoffs that are happening
underneath the Elasticsearch covers.
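
If the optimize is applied shard by shard as described, then max_num_segments would act as a per-shard cap, and the index-wide total is bounded by shards times that cap. A small arithmetic sketch under that assumption (the function name is made up for illustration):

```shell
#!/bin/sh
# Upper bound on segments across an index, assuming max_num_segments
# is enforced separately on each shard.
max_total_segments() {
  shards="$1"
  per_shard="$2"
  echo $(( shards * per_shard ))
}

max_total_segments 10 3    # 10 shards capped at 3 segments each: prints 30
max_total_segments 10 30   # the 22-30 per shard reported above: prints 300
```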

You usually need to issue the optimize twice. This is a bug as far as I
can tell. It never reports failure.
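
Because the call never reports failure, it may help to check the segment counts after each pass rather than trusting the response. The sketch below reads the output of the indices segments endpoint (_segments) and repeats the optimize once if any shard is still over the target. It rests on assumptions: the host and index are placeholders, the helper names are made up, and the response is assumed to carry a "num_search_segments" field per shard copy, which should be checked against your version.

```shell
#!/bin/sh
# Hypothetical host and index name; adjust for your cluster.
ES_HOST="${ES_HOST:-http://server:9200}"
INDEX="${INDEX:-myindex}"

# Read a _segments response on stdin and print one segment count per
# shard copy. Assumes the response carries "num_search_segments".
segment_counts() {
  grep -o '"num_search_segments"[[:space:]]*:[[:space:]]*[0-9][0-9]*' |
    grep -o '[0-9][0-9]*$'
}

# Exit 0 if any shard copy on stdin still has more than $1 segments.
needs_another_pass() {
  max="$1"
  for n in $(segment_counts); do
    if [ "$n" -gt "$max" ]; then
      return 0
    fi
  done
  return 1
}

# Issue the optimize, then issue it once more if a shard is still over
# the target. Not called here because it needs a live cluster.
optimize_twice_if_needed() {
  max="$1"
  curl -s -XPOST "$ES_HOST/$INDEX/_optimize?max_num_segments=$max"
  if curl -s -XGET "$ES_HOST/$INDEX/_segments" | needs_another_pass "$max"; then
    curl -s -XPOST "$ES_HOST/$INDEX/_optimize?max_num_segments=$max"
  fi
}

# Hand-written sample shaped like the assumed response, for a dry check.
SAMPLE='{"shards":{"0":[{"num_search_segments":22}],"1":[{"num_search_segments":3}]}}'
printf '%s\n' "$SAMPLE" | segment_counts
```

Against a live cluster, this would be invoked as "optimize_twice_if_needed 3".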

I have not tried changing the merge policy on an existing index but I would
not be surprised if you need to rebuild the index to get any benefit.

Note that as soon as you add docs to your newly optimized index, the
benefits of optimization start to be lost, so this really only makes sense
on a query-mostly index.

Good luck,
Randy

@Randy is there a specific way to invoke optimize in your case, or is it a
blank optimize call?

I'm referring to optimize when setting max_num_segments, e.g.

curl -XPOST "http://server:9200/myindex/_optimize?max_num_segments=3"
