New Elasticsearch Discovery Plugin

I have just started using Elasticsearch and analyzing its source code.

I need to modify its discovery system so that nodes are registered in a kind
of database. I suppose it is easier to develop a plugin than to modify the
source itself.
But after reading some discovery plugins, such as the built-in Zen discovery,
the ZooKeeper discovery and the cloud-aws plugin, I still don't understand
some things.

  1. How do I provide the discovered nodes to the Elasticsearch core
    program? I think it uses the "DiscoveryNodes" class.
  2. I saw that the different discovery classes take different arguments in
    their constructors, but where are those arguments defined?
  3. After reading the code, I don't understand what the "ClusterState" is
    or why it is useful.
  4. I don't understand how the "AbstractLifecycleComponent" class
    works. It looks like a thread or a Runnable, but it isn't one and just
    provides "doStart". What is it?
  5. Is the publish action from the master mandatory? If I use a
    database, I won't need it.

I didn't find enough documentation about the source code so a little help
would be great.

To implement this discovery plugin, I thought about a thread, started by
the discovery class, that periodically retrieves information from the
database. But I don't know where I have to register the new nodes that this
thread discovers in the database (in the DiscoveryNodes class?).
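
The polling idea described above could be sketched roughly as follows. This
is plain Java with no Elasticsearch classes: RegistryPoller,
NodeChangeListener and the Supplier standing in for the database lookup are
all hypothetical names, and the listener marks the place where a real
discovery implementation would have to act on membership changes.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical callback: where a discovery implementation would react
// to nodes appearing in or disappearing from the database.
interface NodeChangeListener {
    void onChange(Set<String> joined, Set<String> left);
}

// Polls a registry (stand-in for the database) and reports differences
// between two consecutive polls.
class RegistryPoller {
    private final Supplier<Set<String>> registry; // hypothetical lookup
    private final NodeChangeListener listener;
    private Set<String> known = new HashSet<>();
    private ScheduledExecutorService scheduler;

    RegistryPoller(Supplier<Set<String>> registry, NodeChangeListener listener) {
        this.registry = registry;
        this.listener = listener;
    }

    // One polling round: compare the registry with the last known set.
    synchronized void pollOnce() {
        Set<String> current = new HashSet<>(registry.get());
        Set<String> joined = new HashSet<>(current);
        joined.removeAll(known);
        Set<String> left = new HashSet<>(known);
        left.removeAll(current);
        known = current;
        if (!joined.isEmpty() || !left.isEmpty()) {
            listener.onChange(joined, left);
        }
    }

    // Runs pollOnce repeatedly on a single background thread.
    synchronized void start(long periodMillis) {
        scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(this::pollOnce, 0, periodMillis,
                TimeUnit.MILLISECONDS);
    }

    synchronized void stop() {
        if (scheduler != null) {
            scheduler.shutdownNow();
        }
    }
}
```

The listener is only called when the set of node identifiers actually
changes, so an unchanged database produces no work between polls.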

Thanks for reading.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/41ec456c-744f-4454-8d47-89e89c7b5a93%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

  1. Yes. Each discovered node is represented by a DiscoveryNode, and the
    set of nodes (DiscoveryNodes) is part of the ClusterState.

  2. Can you please be more specific: which discovery classes use which
    constructors? Maybe the answer is in 4.

  3. ClusterState keeps the current state of the cluster regarding nodes,
    indices, mappings, and, most importantly, the elected leading node that
    has the privilege to write the cluster state. The leading node in
    Elasticsearch is called "master".

  4. Elasticsearch uses a customized version of the Guice dependency
    injection framework. You can find an introduction at
    https://github.com/google/guice

  5. Yes, it is mandatory. Otherwise, nodes would not be able to receive
    cluster state updates.
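
Points 3 and 5 can be pictured with a small toy model. This is only a
sketch under simplifying assumptions and not Elasticsearch code: ToyMaster,
ToyFollower and ToyClusterState are hypothetical classes, and the version
check is an assumed rule chosen for the illustration. It shows one writer
(the master) producing versioned states and pushing each one to the other
nodes, which is the role the publish step plays.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy stand-in for a cluster state: immutable, versioned, names the master.
final class ToyClusterState {
    final long version;
    final String masterId;
    final Set<String> nodeIds;

    ToyClusterState(long version, String masterId, Set<String> nodeIds) {
        this.version = version;
        this.masterId = masterId;
        this.nodeIds = Collections.unmodifiableSet(new TreeSet<String>(nodeIds));
    }

    // Every change yields a new state object with a higher version.
    ToyClusterState withNode(String nodeId) {
        Set<String> next = new TreeSet<String>(nodeIds);
        next.add(nodeId);
        return new ToyClusterState(version + 1, masterId, next);
    }
}

// A non-master node: it never writes the state, it only applies what it receives.
class ToyFollower {
    private ToyClusterState applied;

    boolean receive(ToyClusterState published) {
        if (applied != null && published.version <= applied.version) {
            return false; // stale or duplicate state, ignored
        }
        applied = published;
        return true;
    }

    ToyClusterState applied() {
        return applied;
    }
}

// The master is the only writer; publishing pushes each new state to followers.
class ToyMaster {
    private ToyClusterState state;
    private final List<ToyFollower> followers = new ArrayList<ToyFollower>();

    ToyMaster(String masterId) {
        Set<String> initial = new TreeSet<String>();
        initial.add(masterId);
        state = new ToyClusterState(1, masterId, initial);
    }

    void addFollower(ToyFollower follower) {
        followers.add(follower);
        follower.receive(state);
    }

    void nodeJoined(String nodeId) {
        state = state.withNode(nodeId);
        for (ToyFollower follower : followers) {
            follower.receive(state); // the "publish" step
        }
    }
}
```

Without the loop in nodeJoined, a follower would keep its old state
forever, which is the situation point 5 warns about.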

Are you sure you need another database? The ClusterState is already a mini
database; its content is written to disk as SMILE-encoded JSON.

Note that if you poll information from a remote database, you have to
implement failover and recovery for the case where contact with the database
is lost. Each node carries a node identifier, so perhaps all you want to do
is save the node identifier in the database (for whatever reason).
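
One possible shape for the failover handling mentioned above, as a sketch
only: ResilientRegistryReader and its methods are hypothetical, the Callable
stands in for the remote database query, and "keep the last good answer and
back off" is just one policy among several.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Callable;

// Wraps a remote lookup so that a lost connection does not wipe out
// the node list that was read last.
class ResilientRegistryReader {
    private final Callable<Set<String>> lookup; // hypothetical database query
    private Set<String> lastKnown = Collections.emptySet();
    private int consecutiveFailures = 0;

    ResilientRegistryReader(Callable<Set<String>> lookup) {
        this.lookup = lookup;
    }

    // Returns fresh data when the lookup works, otherwise the last good result.
    synchronized Set<String> read() {
        try {
            Set<String> fresh = new HashSet<String>(lookup.call());
            lastKnown = Collections.unmodifiableSet(fresh);
            consecutiveFailures = 0; // recovery: contact is back
        } catch (Exception e) {
            consecutiveFailures++;   // failover: keep serving lastKnown
        }
        return lastKnown;
    }

    synchronized int consecutiveFailures() {
        return consecutiveFailures;
    }

    // Delay before the next attempt: doubles per failure, capped at maxMillis.
    synchronized long nextDelayMillis(long baseMillis, long maxMillis) {
        long delay = baseMillis;
        for (int i = 0; i < consecutiveFailures && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }
}
```

Serving stale data is a trade-off: the node list stays usable while the
database is down, but nodes that died in the meantime are not noticed until
contact is restored.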

Jörg

(In reply to StarDrek59's message of Tue, May 13, 2014 at 3:19 PM.)

First, thanks for replying.

I'm working in a distributed environment where several services will use
the Elasticsearch service, so a kind of database is available to locate the
different services, Elasticsearch included.

I work across different networks, so multicast can't be used. Can unicast
ping discovery work in this kind of setup?
If not, the idea is to replace ping discovery with lookups in my database,
both to get the information needed for discovery and to detect failures of
the different nodes (the database is in charge of the services' heartbeats
and their timeouts). In that case the "publish" step is not mandatory for
me, and dropping it would decrease communication between nodes, which is
the goal of this kind of discovery. But I'm afraid that it will affect the
behavior of Elasticsearch.
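
The heartbeat-and-timeout bookkeeping described above might look like the
following sketch. HeartbeatTable is a hypothetical in-memory stand-in for
the database, and timestamps are passed in explicitly so the behavior is
easy to follow; none of this is Elasticsearch API.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Records the last heartbeat per node and treats a node as failed once
// its heartbeat is older than the timeout.
class HeartbeatTable {
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<String, Long>();
    private final long timeoutMillis;

    HeartbeatTable(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever a node reports in.
    void heartbeat(String nodeId, long nowMillis) {
        lastSeen.put(nodeId, nowMillis);
    }

    // Nodes whose last heartbeat is no older than the timeout, sorted by id.
    Set<String> aliveNodes(long nowMillis) {
        Set<String> alive = new TreeSet<String>();
        for (Map.Entry<String, Long> entry : lastSeen.entrySet()) {
            if (nowMillis - entry.getValue() <= timeoutMillis) {
                alive.add(entry.getKey());
            }
        }
        return alive;
    }
}
```

With a 5000 ms timeout, a node last seen at time 0 is still listed at time
5000 but no longer at time 6000.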

To summarize: to implement that kind of discovery, I need to "register"
the discovered nodes in the "DiscoveryNodes" class and update the
"ClusterState". Is that correct?

One more small question (I haven't yet read the documentation you pointed
to): if I must implement a kind of thread, does my class need to inherit
from "AbstractLifecycleComponent", or can I just implement Java's
"Runnable" interface?
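
To make the difference between the two options concrete, here is a toy
model of the lifecycle pattern. It is an assumption-laden sketch and not
the Elasticsearch class: ToyLifecycleComponent and PollingComponent are
hypothetical. The point it illustrates is that such a component is not a
thread itself; it has start and stop hooks in which it can create and shut
down a thread running an ordinary Runnable.

```java
// Toy lifecycle base class: tracks the state and delegates to hooks.
abstract class ToyLifecycleComponent {
    private enum State { INITIALIZED, STARTED, STOPPED }

    private State state = State.INITIALIZED;

    final synchronized void start() {
        if (state == State.STARTED) {
            return; // starting twice has no effect
        }
        doStart();
        state = State.STARTED;
    }

    final synchronized void stop() {
        if (state != State.STARTED) {
            return;
        }
        doStop();
        state = State.STOPPED;
    }

    final synchronized boolean isStarted() {
        return state == State.STARTED;
    }

    protected abstract void doStart();

    protected abstract void doStop();
}

// Owns a worker thread that runs the given task periodically.
class PollingComponent extends ToyLifecycleComponent {
    private final Runnable task;
    private final long periodMillis;
    private Thread worker;

    PollingComponent(Runnable task, long periodMillis) {
        this.task = task;
        this.periodMillis = periodMillis;
    }

    @Override
    protected void doStart() {
        worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                task.run();
                try {
                    Thread.sleep(periodMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // leave the loop
                }
            }
        }, "registry-poller");
        worker.setDaemon(true);
        worker.start();
    }

    @Override
    protected void doStop() {
        worker.interrupt();
        try {
            worker.join(1000); // wait briefly for the worker to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In this arrangement the Runnable holds the polling logic, while the
component decides when that logic starts and stops.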

Thanks for your help.