How to set max_merged_segment at startup?


(dottom) #1

Is there a way to set max_merged_segment (or any of the merge
settings) at startup?

I can set max_merged_segment dynamically like this and it works as
expected:

curl -XPUT localhost:9200/myindex/_settings -d '{
    "index" : {
        "merge.policy.max_merged_segment" : "2g"
    }
}'

What I would like to do is set these for all indexes by default at
startup, but this does not set it:

index:
  number_of_shards: 1
  number_of_replicas: 0
  merge.policy.max_merged_segment: 2g

This does not work either:

index:
  number_of_shards: 1
  number_of_replicas: 0

  merge:
    policy:
      max_merged_segment: 2g

I have also tried substituting "max_merge_size" for
"max_merged_segment".


(Shay Banon) #2

When you set it in the configuration file, it will be applied if that
node is the master when the index gets created. You can also specify the
same setting as an index setting when you create the index.

I see where the problem is: at the configuration level on startup, the
setting is misnamed. It is called max_merge_segment (merge instead of
merged). I will fix that.
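
A minimal sketch of the second option mentioned above, passing the setting when the index is created. The host, the index name, and the DRY_RUN switch are placeholders added for illustration and are not from this thread; the body reuses the key and value from the _settings call in the first post, wrapped in "settings" for the create index request.

```shell
#!/bin/sh
# Sketch only: ES_HOST, INDEX and DRY_RUN are illustrative placeholders.
# DRY_RUN defaults to 1, so the request is printed rather than sent.
ES_HOST="${ES_HOST:-localhost:9200}"
INDEX="${INDEX:-myindex}"
DRY_RUN="${DRY_RUN:-1}"

# Request body: same key/value as the dynamic _settings call,
# placed under "settings" for index creation.
create_body() {
    cat <<'EOF'
{
    "settings" : {
        "index" : {
            "merge.policy.max_merged_segment" : "2g"
        }
    }
}
EOF
}

create_index() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "PUT http://$ES_HOST/$INDEX"
        create_body
    else
        # The Content-Type header is required by newer releases;
        # it is assumed to be harmless on older ones.
        create_body | curl -XPUT "http://$ES_HOST/$INDEX" \
            -H 'Content-Type: application/json' -d @-
    fi
}

create_index
```

Run it with DRY_RUN=0 to actually send the request. This only affects the index being created, so it would have to be repeated for each new index.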


(Shay Banon) #3

Here is the issue:
https://github.com/elasticsearch/elasticsearch/issues/1280.


(Huy Le) #4

Is there any way to change the merge policy (from tiered to log_byte_size)
without downtime? Our production cluster currently has a single master. If
the configuration in the file is only read at startup of the master node,
we won't be able to get the master to pick up the new configuration: when
the master leaves the cluster to restart, another node is promoted to be
the new master.

Thanks!

Huy


(Shay Banon) #5

There is no way to change the merge policy type other than either
changing it in the settings and restarting the cluster, or closing the
index, updating the merge policy type, and opening it again. Why do you want
to change the type?


(Huy Le) #6

We run Solr with acceptable performance using the log_byte_size merge
policy. We would like to take a fast track to acceptable performance
on our ES cluster by setting the merge policy on ES to the same setting we
currently use on our Solr cluster. The issue we are experiencing is that
there are so many small segment files (over 100 per shard) on a 30-shard
cluster that ES spends too much CPU time on merging.

Huy


(Shay Banon) #7

Interesting... Those should be merged, unless they are above the
max_merged_segment parameter. Is that the case?

I pointed this out before, but just to make sure: you can change the merge
policy type by closing the index, using update settings to change the merge
policy type setting, and opening the index again.
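
The close / update settings / open sequence described above could be sketched roughly as follows. The host, the index name, and the DRY_RUN switch are placeholders added for illustration, and the setting key merge.policy.type is an assumption about what "merge policy type setting" refers to; check it against the documentation for your version before relying on it.

```shell
#!/bin/sh
# Sketch only: ES_HOST, INDEX and DRY_RUN are illustrative placeholders.
# DRY_RUN defaults to 1, so requests are printed rather than sent.
ES_HOST="${ES_HOST:-localhost:9200}"
INDEX="${INDEX:-myindex}"
DRY_RUN="${DRY_RUN:-1}"

# Send one request, or just print it when DRY_RUN=1.
es() {
    method="$1"; req_path="$2"; body="${3:-}"
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$method http://$ES_HOST/$req_path${body:+ $body}"
    elif [ -n "$body" ]; then
        # -f makes curl exit non-zero on HTTP errors so the chain below stops.
        curl -f -X"$method" "http://$ES_HOST/$req_path" \
            -H 'Content-Type: application/json' -d "$body"
    else
        curl -f -X"$method" "http://$ES_HOST/$req_path"
    fi
}

# Close the index, update the (assumed) merge policy type key, reopen.
change_merge_policy() {
    policy="$1"
    es POST "$INDEX/_close" &&
    es PUT "$INDEX/_settings" \
        "{\"index\":{\"merge.policy.type\":\"$policy\"}}" &&
    es POST "$INDEX/_open"
}

change_merge_policy log_byte_size
```

If the settings update fails, the chain stops and the index stays closed, so it has to be reopened by hand.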


(Huy Le) #8

The files are under max_merged_segment.

Wouldn't closing the index prevent searching from working?

Alternatively, if we had more than one master, and we changed the config on
master 1 and restarted it, then changed the config on master 2 and
restarted it, then restarted the rest of the cluster, would that work?
Related question: what is the procedure for changing a cluster from a
single master to more than one master?

Thanks!

Huy


(Shay Banon) #9

Closing the index will cause search to not work, but that's the only way to
change the merge policy type.

I think you misunderstand how elasticsearch works. In a cluster, there will
be a single master.


(Huy Le) #10

We will use the index close/open approach for changing the settings.

Regarding the master: I thought there was a single master too, but then I
saw a thread somewhere saying there was more than one master. Thanks
for the clarification.

Huy

