Switching back to ConcurrentMergeScheduler


(David Smith-2) #1

I see that ES switched back to ConcurrentMergeScheduler in 1.1.1 because the
default used in 1.1.0 was affecting indexing performance:
https://github.com/elasticsearch/elasticsearch/issues/5817

We're on 1.1.0 and cannot upgrade to 1.1.1 for the time being. Is there a
way to switch it back using the API? I tried the following command, but it
does not seem to take effect.

curl -i -XPUT localhost:9200/_cluster/settings -d '{ "persistent": {
"index.merge.scheduler.type":
"org.elasticsearch.index.merge.scheduler.ConcurrentMergeSchedulerProvider"
} }'
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 52

{"acknowledged":true,"persistent":{},"transient":{}}

It does not seem to be set when I try to re-GET it (and no errors in logs
at DEBUG level or above).

curl -i -XGET localhost:9200/_cluster/settings
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 66

{"persistent":{"threadpool":{"bulk":{"size":"8"}}},"transient":{}}

Am I specifying the scheduler the wrong way? I also tried just
specifying ConcurrentMergeSchedulerProvider instead of the full class name,
but that didn't work either.

Any ideas?
David
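
For anyone checking the same thing: one way to tell whether a cluster settings update was accepted is to look for the key in the JSON that comes back. Below is a minimal Python sketch, assuming the responses have the shape shown above. The helper names flatten and was_applied are made up for illustration and are not part of any Elasticsearch client. An empty "persistent" object in the PUT response most likely means the setting was not accepted.

```python
import json

def flatten(settings, prefix=""):
    """Flatten nested settings into dotted keys, e.g. threadpool.bulk.size."""
    flat = {}
    for key, value in settings.items():
        dotted = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, dotted + "."))
        else:
            flat[dotted] = value
    return flat

def was_applied(response_body, scope, key):
    """Return True if `key` appears under `scope` in a cluster settings response."""
    response = json.loads(response_body)
    return key in flatten(response.get(scope, {}))

# The two response bodies quoted in the post above.
put_response = '{"acknowledged":true,"persistent":{},"transient":{}}'
get_response = '{"persistent":{"threadpool":{"bulk":{"size":"8"}}},"transient":{}}'

print(was_applied(put_response, "persistent", "index.merge.scheduler.type"))  # False
print(was_applied(get_response, "persistent", "threadpool.bulk.size"))        # True
```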

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/601a831d-2c8e-4615-b816-435a6d4e4d9c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Jörg Prante) #2

I use this on 1.1.0 in my config/elasticsearch.yml

index:
  merge:
    scheduler:
      type: concurrent
      max_thread_count: 4
    policy:
      type: tiered
      max_merged_segment: 1gb
      segments_per_tier: 4
      max_merge_at_once: 4
      max_merge_at_once_explicit: 4

threadpool:
  merge:
    type: fixed
    size: 4
    queue_size: 32

Explanation:

  • use the concurrent scheduler and limit it to 4 threads. I find that 4
    threads can keep up with the highest bulk insertion rate I could generate
  • use the tiered policy (the default; it is the most flexible in selecting
    segments to merge)
  • limit merged segments to 1gb (this limits the size of the segment
    files; the smaller the files, the faster the merges, but the more
    files are created)
  • allow 4 segments per tier (do not let the number of segments per tier
    get too high)
  • merge 4 segments at each merge step (this limits the total run time and
    resource consumption of a segment merge step)
  • apply the same limit to merges triggered by an explicit _optimize API call
  • extend the thread pool to 4 merge threads with a maximum of 32 merge
    operations in the queue (32 should be sufficient to handle outstanding
    merges)

From time to time, if the number of files gets very high (>500) and the index
is calm (no indexing, no heavy search), I do a manual _optimize.

Jörg
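
The rule of thumb above (more than 500 files and a quiet index) could be sketched as a small check before calling _optimize. This is only an illustration: the function names, the way the file count and activity flags are obtained, the index name "myindex" and the default host are all assumptions, not something from the post.

```python
def should_optimize(file_count, indexing_active, heavy_search, max_files=500):
    """Return True when the file count is high and the index is calm."""
    calm = not indexing_active and not heavy_search
    return file_count > max_files and calm

def optimize_url(index=None, host="localhost:9200"):
    """Build the _optimize endpoint URL, for one index or for all indices."""
    path = f"/{index}/_optimize" if index else "/_optimize"
    return f"http://{host}{path}"

# Hypothetical numbers: 620 files, no indexing, no heavy search.
if should_optimize(file_count=620, indexing_active=False, heavy_search=False):
    # The request itself would be a POST, e.g. curl -XPOST <url>
    print(optimize_url("myindex"))
```

The threshold is kept as a parameter because 500 is one person's experience on one setup, not a general limit.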



(David Smith-2) #3

Thanks, Jörg. Is it possible to set these via the API instead of changing the
YAML?



(Jörg Prante) #4

No, you cannot change the merge scheduler settings via the API. Updating
threadpool settings works.

Jörg
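
Since threadpool settings can be updated at runtime, here is a rough Python sketch of building the request body for such an update. The helper name cluster_settings_body is hypothetical. The merge pool keys are taken from the YAML earlier in the thread, on the assumption (not verified here) that they are accepted the same way as the threadpool.bulk.size setting visible in the GET output above.

```python
import json

def cluster_settings_body(settings, scope="transient"):
    """Build the JSON body for PUT /_cluster/settings."""
    if scope not in ("transient", "persistent"):
        raise ValueError("scope must be 'transient' or 'persistent'")
    return json.dumps({scope: settings}, sort_keys=True)

# Values from the elasticsearch.yml example; assumed to be updatable via the API.
body = cluster_settings_body(
    {"threadpool.merge.size": 4, "threadpool.merge.queue_size": 32},
    scope="persistent",
)
print(body)
# Send it the same way as in the first post:
#   curl -XPUT localhost:9200/_cluster/settings -d "$BODY"
```

After sending it, re-GET the cluster settings and check that the keys show up, as in the first post.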



(David Smith-2) #5

Ahh, got it. Thanks.


