Auto-optimize plugin & Cluster Singleton Plugin infrastructure

Hey all,

I was thinking of writing an ES plugin that could use defined off-peak
times to analyse indices in the cluster and automatically optimize them
based on a policy configuration (# segments, segment size etc). This way
an operator could defer the cost of optimizing to a defined quiet time of
the day to minimise the IO/CPU impact.
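
To make the "defined off-peak times" idea concrete, here is a minimal sketch of a daily window check. The class name OffPeakWindow and its methods are hypothetical (nothing like this exists in ES as far as this thread establishes), and it uses java.time for brevity. It only shows that the window has to cope with ranges that wrap past midnight.

```java
import java.time.LocalTime;

/** Hypothetical helper: a daily off-peak window such as 01:00-05:00 or 22:00-04:00. */
final class OffPeakWindow {
    private final LocalTime start; // inclusive
    private final LocalTime end;   // exclusive

    OffPeakWindow(LocalTime start, LocalTime end) {
        this.start = start;
        this.end = end;
    }

    /** Returns true if the given time of day falls inside the window. */
    boolean contains(LocalTime now) {
        if (start.equals(end)) {
            return false; // empty window: never off-peak
        }
        if (start.isBefore(end)) {
            // Same-day window, e.g. 01:00-05:00.
            return !now.isBefore(start) && now.isBefore(end);
        }
        // Window wraps past midnight, e.g. 22:00-04:00.
        return !now.isBefore(start) || now.isBefore(end);
    }
}
```

A plugin could consult something like this before starting each optimize and again between indices, so that work stops once the quiet period is over.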

Firstly, I wonder whether anyone else thinks this would be useful, or
whether I've missed an existing ES feature that already does something
similar. For us, we leave the merge policy as is, review the _segments API,
and periodically fire off an _optimize call with max_num_segments=1, which
we try to do off peak to reduce the impact. This seemed like a generally
useful feature for all ES users, so I thought about writing a plugin that
everyone could use.

This whole topic then brought up the problem of a plugin that really should
only be run 'once' - it's a cluster-wide singleton service. This previous
mail topic was of interest here:

http://elasticsearch-users.115913.n3.nabble.com/Using-river-for-cyclical-operations-on-the-cluster-td3727472.html

This is basically the same sort of thing I need. I've started delving into
the River code and I'm a tad confused about where exactly it mandates this
cluster-wide bit. Can someone point the finger at it? I've been trying to
grok the RiversService and the way ClusterStateChange events are handed
about, and my gut feel is that this is how a node is 'elected' to be the
river controller. If I read things right, the River only runs on the local
node when that node is the master. Is that true?

Secondly, while this could be implemented 'as a River', it doesn't really
'sound like what a River does'. Maybe ES needs a higher-level abstraction,
a Cluster Wide Singleton Plugin, of which the River Service would be one
example. That way other plugins (like mine) could leverage the
infrastructure.

Again, I could have missed something in the code. I'm hoping people with
experience can point me to the right way to do this and, indeed, tell me
whether it's even worth embarking on.

regards,

Paul

Hi, take a look at this:
http://www.elasticsearch.org/guide/reference/index-modules/merge.html

From: Paul Smith
Sent: Wednesday, May 16, 2012 11:17 AM
To: elasticsearch@googlegroups.com
Subject: auto-optimize plugin & Cluster Singleton Plugin infrastructure

On 16 May 2012 13:28, medcl2000@gmail.com wrote:

Hi, take a look at this:
http://www.elasticsearch.org/guide/reference/index-modules/merge.html

Yep, I've read that one, but the problem is that these configs
(particularly the tiered one) aim to provide a good merging algorithm that
trades off time against IO. As I understand it, there will be cases where
an index ends up with several fairly large segments in a shard, which is
not good enough for performance reasons, and you'd want to perform a
'fuller' optimize (perhaps to max_num_segments of 1). However, a fuller
optimize is not something one can sustain during normal heavy production
times; one would prefer it done out of hours, say in the middle of the
night when traffic is low and the heavier merging has far less impact.

Otherwise one wouldn't need an _optimize API call at all, right? If the
merge policy were that good, there would be no need for it. For example,
after a 31 million item bulk index, our searches were a fair bit slower
until we optimized the segments down.

Quoting from this link:

"Note, this can mean that for large shards that holds many gigabytes of
data, the default of max_merged_segment (5gb) can cause for many segments
to be in an index, and causing searches to be slower. Use the indices
segments API to see the segments that an index have, and possibly either
increase the max_merged_segment or issue an optimize call for the index
(try and aim to issue it on a low traffic time)."

I think there are still legitimate cases for this, but I'm happy to keep
this discussion going. Thanks for your input.

Paul

I agree, implementing something like that as a river is not proper. What
you can do is implement a service that gets injected with the cluster
state, and react to the cluster changes API to see if the current node is
the master, and if it is, the service should "start" to work. Have a look
at the RoutingService class that does something similar.

It could be cool to have a generic base class that hides this from the
user, with "onMaster" and "offMaster" callbacks to do its thing (usually
kicking off a scheduled job, so there can be another layer that does the
scheduling automatically as well).
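
A rough sketch of the base class idea described above, under stated assumptions: MasterOnlyService and RecordingService are hypothetical names, and the boolean passed to clusterChanged stands in for whatever "is the local node the master?" check the real cluster state event provides. No Elasticsearch types are used, so this shows only the transition logic, not the actual wiring into ES.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical base class that turns master changes into onMaster/offMaster callbacks. */
abstract class MasterOnlyService {
    private boolean active = false;

    /**
     * Call on every cluster state change. The flag is an assumption: in a real
     * plugin it would be derived from the cluster state handed to a listener.
     */
    final synchronized void clusterChanged(boolean localNodeIsMaster) {
        if (localNodeIsMaster && !active) {
            active = true;
            onMaster();   // this node has just become master: start working
        } else if (!localNodeIsMaster && active) {
            active = false;
            offMaster();  // this node has just lost mastership: stop working
        }
        // No transition: repeated events with the same state are ignored.
    }

    final synchronized boolean isActive() {
        return active;
    }

    protected abstract void onMaster();

    protected abstract void offMaster();
}

/** Example subclass that only records which callbacks fired. */
final class RecordingService extends MasterOnlyService {
    final List<String> events = new ArrayList<>();

    @Override
    protected void onMaster() {
        events.add("onMaster");
    }

    @Override
    protected void offMaster() {
        events.add("offMaster");
    }
}
```

The point of tracking the active flag is that cluster state events arrive for many reasons, so the callbacks should fire only on an actual change of mastership.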

On 17 May 2012 08:11, Shay Banon kimchy@gmail.com wrote:

I agree, implementing something like that as a river is not proper. What
you can do is implement a service that gets injected with the cluster
state, and react to the cluster changes API to see if the current node is
the master, and if it is, the service should "start" to work. Have a look
at the RoutingService class that does something similar.

It could be cool to have a generic base class that hides this from the
user, with "onMaster" and "offMaster" callbacks to do its thing (usually
kicking off a scheduled job, so there can be another layer that does the
scheduling automatically as well).

I've started to put together a basic project structure laying out the
cluster-wide singleton service abstraction with delegate methods here:

https://github.com/Aconex/elasticsearch-autooptimize-plugin

specifically:
https://github.com/Aconex/elasticsearch-autooptimize-plugin/blob/master/src/main/java/com/aconex/elasticsearch/ClusterWideSingletonService.java

with the view that this project will then also incorporate the
auto-optimize code. Clearly there's not a lot here just yet, but I thought
I'd share what I have in my head for what the project will consist of, for
further comment:

  • ClusterWideSingletonService - delegation-based inheritance class to
    simplify Services that can only run on a single node (the master is chosen
    as the single point).
  • ClusterWidePeriodicExecutor - an extension of
    ClusterWideSingletonService that sets up a Timer to run periodically
    (defaulting to once per day, starting at midnight) to execute a given
    Runnable class (which itself is configurable)
  • ClusterFragmentationDetector - sorts all indices in the cluster by
    their fragmentation level, most fragmented first, and asks an
    IndexAutoOptimizer to work on each if required. The reason to sort by
    most fragmented is that the configuration of this Service may have a
    start:end range for off-peak time; the Service will attempt to work on
    the worst index first before moving on to the next, automatically
    exiting its loop once the time has passed the end of the off-peak range.
    This is so the optimisation does not affect production times; the plugin
    will have another crack at the other indices in the next time period. It
    is designed to tackle the worst first as a best-effort approach.
  • IndexAutoOptimizer - uses an IndexFragmentationPolicy to detect
    fragmentation and invokes an IndexOptimizer object to do the actual
    work.
  • IndexFragmentationPolicy - detects whether a specified index is beyond
    the defined thresholds. Initial implementations would probably detect
    based on # segments/segment sizes in a shard and flag an index if they
    go outside a range.
  • IndexOptimizer - given a specified configuration target # segments
    (defaulting to 1), invokes the Optimize API call on the specified index to
    take it to the target threshold.
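
The worst-first loop described in the list above could be sketched roughly like this. Everything here is an assumption for illustration: the class names IndexFragmentation and WorstFirstOptimizer, the use of "largest segment count in any shard" as the fragmentation measure, and the Consumer that stands in for the real Optimize API call. It is not the code in the repository.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

/** Hypothetical summary of one index: its name and the largest segment count in any shard. */
final class IndexFragmentation {
    final String index;
    final int maxSegmentsPerShard;

    IndexFragmentation(String index, int maxSegmentsPerShard) {
        this.index = index;
        this.maxSegmentsPerShard = maxSegmentsPerShard;
    }
}

/** Hypothetical detector loop: worst index first, stop when the off-peak window closes. */
final class WorstFirstOptimizer {
    private final int segmentThreshold;         // policy: optimize when a shard exceeds this
    private final BooleanSupplier insideWindow; // true while we are still in off-peak time
    private final Consumer<String> optimizer;   // stand-in for the Optimize API call

    WorstFirstOptimizer(int segmentThreshold, BooleanSupplier insideWindow, Consumer<String> optimizer) {
        this.segmentThreshold = segmentThreshold;
        this.insideWindow = insideWindow;
        this.optimizer = optimizer;
    }

    /** Returns the names of the indices handed to the optimizer, in the order processed. */
    List<String> run(List<IndexFragmentation> indices) {
        List<IndexFragmentation> sorted = new ArrayList<>(indices);
        // Most fragmented first, so a short window is spent on the worst offenders.
        sorted.sort(Comparator.comparingInt((IndexFragmentation f) -> f.maxSegmentsPerShard).reversed());

        List<String> optimized = new ArrayList<>();
        for (IndexFragmentation f : sorted) {
            if (!insideWindow.getAsBoolean()) {
                break; // window is over; remaining indices wait for the next period
            }
            if (f.maxSegmentsPerShard <= segmentThreshold) {
                break; // list is sorted, so everything after this is within policy too
            }
            optimizer.accept(f.index);
            optimized.add(f.index);
        }
        return optimized;
    }
}
```

Note that the window is only checked between indices, so a single long optimize can still overrun the end of the off-peak range; that matches the best-effort intent described above.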

What I'm hoping to achieve is a plugin that can be installed in any ES
cluster with built-in configuration that will probably work for a large
majority of people out of the box to self-tune a cluster's indices to a
good working state.
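
For the ClusterWidePeriodicExecutor item in the list above, a minimal sketch of the "once per day, starting at midnight" scheduling might look like the following. DailySchedule is a hypothetical name, and this uses a ScheduledExecutorService rather than the Timer mentioned above, purely as a swapped-in choice for the example.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for running a task once per day at a configured time. */
final class DailySchedule {
    private DailySchedule() {
    }

    /** Time from now until the next occurrence of runAt (tomorrow if it has already passed today). */
    static Duration delayUntilNext(LocalDateTime now, LocalTime runAt) {
        LocalDateTime next = now.toLocalDate().atTime(runAt);
        if (!next.isAfter(now)) {
            next = next.plusDays(1);
        }
        return Duration.between(now, next);
    }

    /**
     * Schedules the task daily at runAt. The returned future should be cancelled
     * when the node stops being master. A fixed 24 hour period is an assumption
     * that ignores daylight saving shifts.
     */
    static ScheduledFuture<?> scheduleDaily(ScheduledExecutorService executor, Runnable task, LocalTime runAt) {
        long initialDelayMs = delayUntilNext(LocalDateTime.now(), runAt).toMillis();
        long periodMs = TimeUnit.DAYS.toMillis(1);
        return executor.scheduleAtFixedRate(task, initialDelayMs, periodMs, TimeUnit.MILLISECONDS);
    }
}
```

Starting the schedule from an on-master callback and cancelling the future when mastership is lost would keep the job running on exactly one node at a time.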

If anyone else would like to contribute to this, let me know!

thanks,

Paul Smith

I am missing some of the classes in the repo; I only see
ClusterWideSingletonService and IndexOptimizer. Regarding
ClusterWideSingletonService, I would have it accept the ClusterService in
the constructor and add itself as a listener when it's constructed.

On Thu, May 17, 2012 at 2:18 PM, Paul Smith tallpsmith@gmail.com wrote:
