"A master in elasticsearch is responsible for handling nodes coming
and going and allocation of shards." (quoting Shay)
Now here comes my own understanding:
A master may hold data or not. A "load balancer" would be a node that
doesn't hold data (node.data: false), but has the HTTP transport
enabled. From what I know it doesn't need to be a master as well. The
idea is that client apps would send requests to it, and the load
balancer would forward the requests to the nodes having the needed
shards, and also gather the results.
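As a minimal sketch, the "load balancer" node described above might be configured in elasticsearch.yml roughly as follows. node.data and the HTTP setting come from this thread; setting node.master to false is an assumption, based on the remark that such a node doesn't need to be a master.

```yaml
# elasticsearch.yml on the "load balancer" node (sketch, not a tested config)
node.data: false     # hold no shards
node.master: false   # assumption: not master-eligible, since it doesn't need to be
http.enabled: true   # accept HTTP requests from client apps
```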
Then, you would have "load balanced" nodes, if you will. Those would
be nodes that hold data, but with HTTP transport disabled
(http.enabled: false). They will only be "workhorses" and won't be
bothered with stuff like HTTP requests for clients or redirecting
requests to other data nodes.
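A matching sketch for one of these "workhorse" nodes, using only the settings named above. Whether such a node should also be master-eligible is left open here.

```yaml
# elasticsearch.yml on a "workhorse" data node (sketch, not a tested config)
node.data: true      # shards can be allocated to this node
http.enabled: false  # no HTTP endpoint; other nodes reach it over the internal transport
```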
node.master set to true means that node can be elected to become the
master of the cluster.
node.data set to true means that shards ("data") can be allocated to
that node.
Setting both to false (or simply setting node.client to true) will cause
the node to act as a client to the cluster, potentially acting as a load
balancer.
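The combinations described in this reply could be written in elasticsearch.yml like this. It is only a sketch: pick one variant per node, and the commented-out lines are the alternatives.

```yaml
# Variant 1: master-eligible node that also holds data
node.master: true
node.data: true

# Variant 2: client node, potentially acting as a load balancer
# node.master: false
# node.data: false

# Variant 2, shorthand from the reply above
# node.client: true
```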
"A master in elasticsearch is responsible for handling nodes coming
and going and allocation of shards." (quoting Shay)
Now here comes my own understanding:
A master may hold data or not. A "load balancer" would be a node that
doesn't hold data (node.data: false), but has the HTTP transport
enabled. From what I know it doesn't need to be a master as well. The
idea is that client apps would send requests to it, and the load
balancer would forward the requests to the nodes having the needed
shards, and also gather the results.
Then, you would have "load balanced" nodes, if you will. Those would
be nodes that hold data, but with HTTP transport disabled
(http.enabled: false). They will only be "workhorses" and won't be
bothered with stuff like HTTP requests for clients or redirecting
requests to other data nodes.
If I am going to start out with 2 machines, would it be sensible to
set both node.master and node.data to true on both machines, put a real
load balancer in front of them, and just round-robin between them?
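For reference, the two-machine setup being asked about would mean the same fragment on both machines. As far as I know both settings already default to true, so spelling them out is mostly for clarity.

```yaml
# elasticsearch.yml, identical on both machines (sketch)
node.master: true   # either machine can be elected master
node.data: true     # both machines hold shards
# HTTP is left enabled so the external load balancer can round-robin between them
```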
I am in EC2, so the ec2 plugin sounds appealing, i.e. if I spin up a
machine with the correct tags it will join the cluster. I'm not sure about
the implications of having tons of nodes with both master and data set to
true, though.
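A rough sketch of the tag-based discovery mentioned here, assuming the cloud-aws plugin is installed and AWS credentials are configured separately. The tag name and value (es_cluster, search-prod) are made up for illustration.

```yaml
# elasticsearch.yml (sketch; assumes the cloud-aws plugin and its EC2 discovery)
discovery.type: ec2
# Only consider instances carrying this tag as cluster members.
# "es_cluster" and "search-prod" are hypothetical names.
discovery.ec2.tag.es_cluster: search-prod
```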
Also, is there any reason not to run elasticsearch on the same machine as
apache/tomcat?
On Monday, June 25, 2012 7:11:06 AM UTC-5, kimchy wrote:
It's pretty simple:
node.master set to true means that node can be elected to become the
master of the cluster.
node.data set to true means that shards ("data") can be allocated to
that node.
Setting both to false (or simply setting node.client to true) will cause
the node to act as a client to the cluster, potentially acting as a load
balancer.
On Sun, Jun 24, 2012 at 3:48 PM, Radu Gheorghe <radu0g...@gmail.com> wrote:
"A master in elasticsearch is responsible for handling nodes coming
and going and allocation of shards." (quoting Shay)
Now here comes my own understanding:
A master may hold data or not. A "load balancer" would be a node that
doesn't hold data (node.data: false), but has the HTTP transport
enabled. From what I know it doesn't need to be a master as well. The
idea is that client apps would send requests to it, and the load
balancer would forward the requests to the nodes having the needed
shards, and also gather the results.
Then, you would have "load balanced" nodes, if you will. Those would
be nodes that hold data, but with HTTP transport disabled
(http.enabled: false). They will only be "workhorses" and won't be
bothered with stuff like HTTP requests for clients or redirecting
requests to other data nodes.