New Elasticsearch Discovery Plugin

StarDrek59 · May 13, 2014, 1:19pm

I have just started using Elasticsearch and analyzing its source code.

I need to modify its discovery system so that nodes are registered in a kind
of database. I suppose it is easier to develop a plugin than to modify the
source itself.
But after reading some discovery plugins, such as the basic Zen discovery,
the ZooKeeper discovery, and the cloud-aws plugin, I still don't understand
some things.

1. How do I provide the discovered nodes to the Elasticsearch core
program? I think it uses the "DiscoveryNodes" class.
2. I saw that the different discovery classes take different elements in
their constructors, but where are those elements defined?
3. After reading the code, I don't understand what the "ClusterState" is
and why it is useful.
4. I don't understand how the "AbstractLifecycleComponent" class works. It
looks like a thread or runnable class, but it isn't one and just provides
"doStart". What is it?
5. Is the publish action from the master mandatory? If I use a database, I
won't use it.

I didn't find enough documentation about the source code so a little help
would be great.

To implement this discovery plugin, I thought about a thread, started by the
discovery class, that periodically retrieves information from the database.
But I don't know where I have to register the new nodes that this thread
finds in the database (in the DiscoveryNodes class?).

Thanks for reading.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/41ec456c-744f-4454-8d47-89e89c7b5a93%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

jprante · May 13, 2014, 4:36pm

1. Yes, discovered nodes are provided by the DiscoveryNode class.
2. Can you please be more specific: which discovery class uses which
constructors? Maybe the answer is in 4.
3. ClusterState keeps the current state of the cluster regarding nodes,
indices, mappings, and, most important, the elected leading node that has
the privilege to write the cluster state. The leading node in Elasticsearch
is called "master".
4. Elasticsearch uses a customized version of the Guice dependency
injection framework. You can find an introduction in the google/guice
repository on GitHub.
5. Yes, it is mandatory. Otherwise, nodes would not be able to receive
cluster state updates.

Are you sure you need another database? The ClusterState is already a mini
database; its content is written to disk as SMILE-encoded JSON.

Note that if you poll information from a remote database, you have to
implement failover and recovery for the case where contact with the database
is lost. Each node carries a node identifier, so perhaps all you want to do
is save the node identifier in the database (for whatever reason).
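
The failover concern above can be sketched roughly as follows. This is only a minimal illustration in plain Java, not Elasticsearch code: `RegistryPoller`, the `Supplier` standing in for the database lookup, and the string node identifiers are all hypothetical names chosen for the example. The idea is simply that a failed poll keeps the last known node list instead of discarding it, and counts consecutive failures so the caller can decide when to react.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical poller: none of these names come from Elasticsearch.
final class RegistryPoller {
    // Stands in for the remote database lookup; may throw when the database is unreachable.
    private final Supplier<Set<String>> registry;
    private volatile Set<String> lastKnown = Collections.emptySet();
    private final AtomicInteger consecutiveFailures = new AtomicInteger();

    RegistryPoller(Supplier<Set<String>> registry) {
        this.registry = registry;
    }

    /** Runs one poll. On failure the previously known node list is kept. */
    void pollOnce() {
        try {
            Set<String> fresh = registry.get();
            lastKnown = Collections.unmodifiableSet(new LinkedHashSet<>(fresh));
            consecutiveFailures.set(0);
        } catch (RuntimeException e) {
            // Swallowing the exception also keeps the scheduled task alive.
            consecutiveFailures.incrementAndGet();
        }
    }

    /** Last successfully fetched node identifiers (empty before the first success). */
    Set<String> nodes() {
        return lastKnown;
    }

    int consecutiveFailures() {
        return consecutiveFailures.get();
    }

    /** Schedules pollOnce() with a fixed delay; the caller must shut the executor down. */
    ScheduledExecutorService startPolling(long periodMillis) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleWithFixedDelay(this::pollOnce, 0, periodMillis, TimeUnit.MILLISECONDS);
        return executor;
    }
}
```

What to do once `consecutiveFailures()` grows (keep serving the stale list, or treat the node as isolated) is a design decision this sketch deliberately leaves open.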

Jörg


StarDrek59 · May 14, 2014, 7:34am

First, thanks for replying.

I'm working in a distributed environment in which several services will use
the Elasticsearch service. A kind of database is therefore available to
locate the different services that are needed, such as Elasticsearch.

My nodes are on different networks, so multicast can't be used. Can the
unicast ping discovery work in this kind of setup?
If not, the idea is to replace the ping discovery with access to my database,
both to get the information needed for discovery and to detect failures of
the different nodes (the database is in charge of the heartbeat of the
services and their timeouts). In this case the "publish" step is not
mandatory for me, and dropping it would decrease the communication between
nodes, which is the goal of this kind of discovery. But I'm afraid that this
will affect the behavior of Elasticsearch.

To summarize: to do that kind of discovery, I need to "register" the
discovered nodes in the "DiscoveryNodes" class and update the
"ClusterState". Is that correct?
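
To make the "register nodes and update the state" idea concrete, here is a rough sketch in plain Java. It is a simplified model for illustration only and does not use Elasticsearch's ClusterState or DiscoveryNodes classes: `NodeSnapshot`, its version counter, and the string node identifiers are assumptions of the example. It shows how a freshly discovered node set could be compared with the current one to find joined and departed nodes, producing a new immutable snapshot only when something changed.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Simplified model for illustration; NOT Elasticsearch's ClusterState or DiscoveryNodes.
final class NodeSnapshot {
    final long version;
    final Set<String> nodeIds;

    NodeSnapshot(long version, Set<String> nodeIds) {
        this.version = version;
        // Defensive copy so the snapshot never changes after construction.
        this.nodeIds = Collections.unmodifiableSet(new TreeSet<>(nodeIds));
    }

    /** Node ids present in 'discovered' but not in this snapshot. */
    Set<String> joined(Set<String> discovered) {
        Set<String> result = new TreeSet<>(discovered);
        result.removeAll(nodeIds);
        return result;
    }

    /** Node ids present in this snapshot but absent from 'discovered'. */
    Set<String> left(Set<String> discovered) {
        Set<String> result = new TreeSet<>(nodeIds);
        result.removeAll(discovered);
        return result;
    }

    /** Returns this snapshot if nothing changed, otherwise a new one with version + 1. */
    NodeSnapshot updatedWith(Set<String> discovered) {
        if (nodeIds.equals(discovered)) {
            return this;
        }
        return new NodeSnapshot(version + 1, discovered);
    }
}
```

In a real cluster only the master is allowed to write the cluster state, as noted earlier in the thread, so logic of this kind would have to run there and the result would still need to reach the other nodes.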

Another small question (I haven't yet read the documentation you pointed
to): if I must implement a kind of thread, do I need to make my class
inherit from "AbstractLifecycleComponent", or can I just use an
implementation of Java's "Runnable" interface?
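
The two options are not necessarily exclusive. A lifecycle component is not a thread itself; it is an object that gets started and stopped, and it can own a thread that runs a Runnable. The sketch below is a simplified stand-in written for this example, not Elasticsearch's actual AbstractLifecycleComponent: `SimpleLifecycleComponent`, `PollingComponent`, and the thread name are hypothetical. It only illustrates the template-method shape in which `start()` and `stop()` guard the state and delegate to `doStart()` and `doStop()`.

```java
import java.util.concurrent.atomic.AtomicReference;

// Simplified stand-in for the lifecycle pattern; not Elasticsearch's actual class.
abstract class SimpleLifecycleComponent {
    enum State { INITIALIZED, STARTED, STOPPED }

    private final AtomicReference<State> state = new AtomicReference<>(State.INITIALIZED);

    /** Moves to STARTED and calls doStart(); does nothing if already started. */
    public final void start() {
        State current = state.get();
        if (current != State.STARTED && state.compareAndSet(current, State.STARTED)) {
            doStart();
        }
    }

    /** Moves from STARTED to STOPPED and calls doStop(); does nothing otherwise. */
    public final void stop() {
        if (state.compareAndSet(State.STARTED, State.STOPPED)) {
            doStop();
        }
    }

    public final State state() {
        return state.get();
    }

    protected abstract void doStart();

    protected abstract void doStop();
}

// Hypothetical component that owns a worker thread running a plain Runnable.
final class PollingComponent extends SimpleLifecycleComponent {
    private final Runnable task;
    private Thread worker;

    PollingComponent(Runnable task) {
        this.task = task;
    }

    @Override
    protected void doStart() {
        worker = new Thread(task, "registry-poller");
        worker.setDaemon(true);
        worker.start();
    }

    @Override
    protected void doStop() {
        // The task is expected to honor interruption and exit.
        worker.interrupt();
    }
}
```

With this shape, whatever creates the component only calls `start()` and `stop()`; how the periodic work is executed stays an internal detail of the subclass.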

Thanks for the help.

