I'm curious what the recommended way is to deploy a plugin in a cluster so
that the plugin itself is redundant. Say I have 10 nodes... I deploy a
plugin on Node1. Node1 dies. Now what? I may have a new master node, but my
plugin is gone.
Is there a notion of deploying plugins on every node, and having only the
elected master run the "active" plugin?
Excuse me if it's a dumb question. I can't seem to find the answer
anywhere.
Thanks for replying. The problem is that if you have a plugin that, let's
say, is responsible for polling ES _/node/stats and pushing the data
"somewhere", you don't want 100 nodes doing the same thing (that's a rough
example). Deployed on a single node, it is a single point of failure; but
deployed per node, you end up with duplicate data or duplicate service
calls from X simultaneously running plugins.
You have written a lot of plugins/rivers. Are yours meant to be deployed
on a single node?
On Tuesday, January 21, 2014 2:34:49 PM UTC-5, David Pilato wrote:
What's wrong with installing plugins in all nodes?
You must deploy plugins to all nodes, so you will still have the plugin
available when nodes come and go. You could add an action so that you can
turn on/off a plugin explicitly by remote command. Or you can define a
condition so a plugin can decide when to run. For example if you want a
singleton, you can check if you run on the master node, and execute only
then.
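A rough sketch of that "run only on the master" condition, written in Python against already-parsed JSON rather than as a real plugin. It assumes the responses look like those of GET /_nodes/_local (a "nodes" object keyed by node id) and GET /_cluster/state (a top-level "master_node" id). Those shapes differ between Elasticsearch versions, so treat them, and every function name here, as assumptions.

```python
def local_node_id(nodes_local_response):
    """Return the node id from a parsed `GET /_nodes/_local` style response."""
    node_ids = list(nodes_local_response["nodes"])
    if len(node_ids) != 1:
        raise ValueError("expected exactly one local node, got %d" % len(node_ids))
    return node_ids[0]


def is_elected_master(nodes_local_response, cluster_state_response):
    """True when the local node id matches the cluster's elected master."""
    master_id = cluster_state_response.get("master_node")
    return master_id is not None and master_id == local_node_id(nodes_local_response)


def run_if_master(job, nodes_local_response, cluster_state_response):
    """Run `job` only on the elected master; return whether it ran."""
    if is_elected_master(nodes_local_response, cluster_state_response):
        job()
        return True
    return False
```

With the plugin installed everywhere, every node evaluates the same condition and only one of them actually does the work.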
If you are in fact talking about implementing a monitoring system for ES,
then here are my 2 cents:
Regarding how to poll data from ES nodes: I would not implement it as a
"plugin" or anything similar that depends on the ES platform itself. It
should be an external process, and ideally a very light one (a Java process
is not light). The impact on ES should be as small as possible, and it
should not rely on the ES plugin system either. I would consider looking at
something like RRD tools (and extensions on top of them): a really
lightweight process that sits on each node, kicks in once in a while, grabs
all metrics via the REST API and stores them in local storage. It should
then try to send the data to a central store, but the central store may not
always be available, so storing the data locally is important in case the
connection is only restored at a later time. The other argument against
using the ES plugin system is that you should be able to upgrade the
monitoring system independently of the ES cluster; you cannot do this with
ES plugins.
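The "store locally, forward when the central store is reachable" idea could be sketched like this in Python. It is only an illustration: the spool directory layout, the function names, and the `send` callable (assumed to raise OSError when the central store is down) are all made up for the example.

```python
import json
import os
import time


def spool(spool_dir, sample, now=None):
    """Write one metrics sample to its own file and return the path.

    Files are named by millisecond timestamp, so two samples taken within
    the same millisecond would overwrite each other (fine for a sketch).
    """
    os.makedirs(spool_dir, exist_ok=True)
    stamp = int((time.time() if now is None else now) * 1000)
    path = os.path.join(spool_dir, "%015d.json" % stamp)
    tmp = path + ".tmp"
    with open(tmp, "w") as handle:
        json.dump(sample, handle)
    os.replace(tmp, path)  # atomic rename, so flush() never reads a partial file
    return path


def flush(spool_dir, send):
    """Send spooled samples oldest first; stop at the first failure.

    `send` is a hypothetical callable that raises OSError when the central
    store is unreachable. Returns the number of samples sent.
    """
    sent = 0
    if not os.path.isdir(spool_dir):
        return sent
    for name in sorted(n for n in os.listdir(spool_dir) if n.endswith(".json")):
        path = os.path.join(spool_dir, name)
        with open(path) as handle:
            sample = json.load(handle)
        try:
            send(sample)
        except OSError:
            break  # keep this file and all later ones for the next attempt
        os.remove(path)
        sent += 1
    return sent
```

A sample is deleted only after `send` returns, so an outage of the central store just makes the spool grow until the next successful flush.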
Should it poll the same stats from every single node? I think it should!
If an ES cluster gets into trouble, it can be because not every part of the
cluster sees "the same picture". IMO the only way to learn about this is to
collect individual "pictures" from every node. It is also the simplest way
of doing it (and you should definitely aim for simplicity). Resolving
duplicates on the central storage side later should be possible, but you
might not even need to if the size of the collected data is not a problem
for you. The part where you need to be very careful when polling the same
stats from each node is making sure this does not put a high load on the ES
cluster. If it is possible to poll from the local node only (like using
_local in the URL), then opt for it. Though I am not sure if this option is
still available after the 1.0.0.RC1 release.
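One possible way to resolve duplicates on the central side while still keeping each node's "picture" is sketched below in Python. The report fields (`cluster`, `node`, `metric`, `timestamp`, `value`) and the 10-second bucket are assumptions for illustration only, not anything a real tool prescribes.

```python
from collections import defaultdict


def merge_reports(reports, bucket_seconds=10):
    """Group cluster-wide readings by (cluster, metric, time bucket).

    Identical values reported by several nodes collapse into one entry;
    differing values are all kept, so a disagreement between nodes stays
    visible. Values must be hashable (numbers or strings).
    """
    merged = defaultdict(dict)
    for report in reports:
        bucket = int(report["timestamp"] // bucket_seconds) * bucket_seconds
        key = (report["cluster"], report["metric"], bucket)
        merged[key].setdefault(report["value"], []).append(report["node"])
    return dict(merged)


def disagreements(merged):
    """Return the keys for which nodes reported more than one value."""
    return sorted(key for key, values in merged.items() if len(values) > 1)
```

If three nodes report number_of_nodes as 3, 3 and 2 within the same bucket, the merged entry keeps both values along with the nodes that reported them, which is exactly the "not every node sees the same picture" signal.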
Somewhat... The idea is a monitoring 'service'. Hosted. Where it's hosted is
a separate conversation.
I agree with most everything you stated. Having a local agent running is
the way to go. I'm avoiding remote polling, and I see little to no advantage
in distributing it as a plugin. The one advantage is that it is very clean
and simple to install, but such a tight dependency has its downsides, as you
mentioned.
I don't agree on not using Java. Folks using Elasticsearch obviously already
have a JVM installed, so it's an easy jar drop and run. I'm not sure why
you would consider it a heavy process. Have you been looking at my old
JBoss code? RRD is lighter. No objection there. So is Perl.
At first, the data from Elasticsearch is meant to be condensed and sent
across the wire... Small payloads. At some point, I may move to storing the
full blobs, but that can tax a system and certainly adds to hardware costs
for retention. Across x,000 clusters, storing monitoring metrics for
histograms and analysis can get heavy.
If you want to kick off some process, say every 2 seconds, and all it does
is fire a couple of HTTP requests and store the responses in a local file,
then using Java for this might cost more than the existing alternatives.
That is all I meant by "heavy" Java. If such a job can finish within
milliseconds, then you cannot even fully benefit from the JVM goodies.
If, however, you mean a long-running agent, then Java is a valid option.
JBoss RHQ is a good example of monitoring implemented in Java (as you
surely know :-)).
Regards,
Lukáš
Roy, from what I understand, you want plugins that are somehow coordinated
and do not need to be installed on every node.
A similar situation is possible in the HTTP area. Some ES nodes may provide
HTTP and some not, by disabling HTTP.
Deploying a set of coordinated plugins is possible by adding a
coordination service within an "uber plugin" architecture, where a group of
nodes dynamically serves custom requests within a cluster. The idea is to
deploy an uber plugin once to several nodes, not all. From then on, they
accept REST commands for distributed job execution on the nodes that can
serve the job type. Each node can examine the cluster nodes and decide
where to forward the job to. Nodes that have no plugin installed, or that
have the plugin disabled, are not affected.
With a trick, the uber plugin nodes can receive additional jars live, for
serving different job types.
This can be useful for monitoring services, but also for scalable rivers.
All the ES admin has to do is select the appropriate nodes for the uber
plugin.
I started this work and called it gatherer plugin, see
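To make the "examine the cluster nodes and decide where to forward the job" step concrete, here is a small Python sketch. It is not how the gatherer plugin works; the per-node fields `plugin_enabled` and `job_types` and both function names are invented for the example.

```python
import zlib


def eligible_nodes(nodes, job_type):
    """Ids of nodes with the plugin enabled that can serve `job_type`."""
    return sorted(
        node_id
        for node_id, info in nodes.items()
        if info.get("plugin_enabled") and job_type in info.get("job_types", ())
    )


def pick_node(nodes, job_type, job_id):
    """Choose one eligible node for a job, or None if nobody can serve it.

    Hashing the job id means every node that evaluates the same cluster
    view forwards the job to the same place, without extra coordination.
    """
    candidates = eligible_nodes(nodes, job_type)
    if not candidates:
        return None
    index = zlib.crc32(job_id.encode("utf-8")) % len(candidates)
    return candidates[index]
```

Nodes without the plugin, or with it disabled, simply never appear among the candidates, so they are not affected.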
Simple idea. No magic. I like to delude myself into thinking it's the next
phase for ElasticHQ.
Lukas has a point in that distribution as a plugin is not ideal, as a
monitoring agent should not be affected by the system it is monitoring -
not intimately tied.
Lukas,
Didn't know of RHQ. I was there for JBossON.
Sounds a bit like a similar approach to Sematext SPM to me (except they use
CollectD, which is based on RRD if I am not mistaken), but that shouldn't
stop you.
Yes, very similar approach in tech, but different in distribution model.
The use of RRD makes sense for them, as I believe their offering monitors
more than just ES clusters. It's a New Relic type of model. Alternatively,
New Relic has ES monitoring plugins available as well.
On Wednesday, January 22, 2014 12:00:11 PM UTC-5, Lukáš Vlček wrote:
Roy,
As for RHQ, it is the upstream for JON (JBossON). See
http://planet.jboss.org/post/jboss_operations_network_jbosson_jon_rhq_whats_in_a_name
I think it might have been renamed (this happens in JBoss world :-)).
Another simple solution might be to have the plugin run every n
seconds/minutes and check whether the node it runs on is currently the
master node. This ensures that data is collected on only one node in your
cluster, and it keeps working after an outage once a new master has been
elected... of course you have dedicated master nodes and those never fail
In summary, it still might be a better idea to have an external process,
which can be upgraded anytime, even without a cluster outage or another
rolling upgrade.
--Alex
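The periodic variant could look roughly like this in Python. `is_master` and `collect` are hypothetical callables supplied by the caller, and `rounds` and `sleep` exist only so the loop can be bounded and tried out without waiting; none of this is an Elasticsearch API.

```python
import time


def run_periodically(is_master, collect, interval_seconds, rounds, sleep=time.sleep):
    """Every `interval_seconds`, collect only if this node is master right now.

    Re-checking on every round is what lets a newly elected master take
    over after the old one disappears. Returns how many rounds collected.
    """
    collected = 0
    for _ in range(rounds):
        try:
            if is_master():
                collect()
                collected += 1
        except Exception:
            # A failed check or collection must not kill the loop; skip the round.
            pass
        sleep(interval_seconds)
    return collected
```

Because the check is repeated each round rather than done once at startup, a node that becomes master later starts collecting on its next tick.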
Yep. I think Jorg (or someone in this thread) suggested this approach. It's
likely the approach I will take at first swing, with the hope of moving
later to a hybrid, where the same core code can be installed as either a
plugin or a standalone process running in a separate JVM.
On Monday, January 27, 2014 4:13:32 AM UTC-5, Alexander Reelsen wrote:
On Thu, Jan 23, 2014 at 5:14 PM, Ivan Brusic <iv...@brusic.com> wrote:
Of course, this would just gather the data, with no reporting
capabilities.
Ivan