stability can be one of the reasons. As your data nodes can go out of
memory if you fire up certain queries, you do not want the cluster
stability impacted by this. A strategy to circumvent this problem is to
have a couple of data and master only nodes.
But if there are issues with data nodes, that would impact user requests
anyway, correct?
On Fri, Nov 15, 2013 at 10:39 AM, Alexander Reelsen alr@spinscale.de wrote:
Hey,
stability can be one of the reasons. As your data nodes can go out of
memory if you fire up certain queries, you do not want the cluster
stability impacted by this. A strategy to circumvent this problem is to
have a couple of data and master only nodes.
A shard of an index is replicated across the data nodes (you can configure
the number of replicas). If a data node fails, your users would not be
impacted as long as one replica is still available.
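
To make the replica point concrete, here is a rough Python sketch of the idea, not of how Elasticsearch itself tracks shards. The layout, node names and function names are made up for illustration; the only assumption carried over is that a shard stays reachable while at least one of its copies sits on a live node.

```python
def shard_available(copy_nodes, failed_nodes):
    """A shard stays reachable while at least one copy is on a live node."""
    return any(node not in failed_nodes for node in copy_nodes)


def unavailable_shards(layout, failed_nodes):
    """Return the sorted ids of shards whose copies are all on failed nodes."""
    return sorted(
        shard
        for shard, copy_nodes in layout.items()
        if not shard_available(copy_nodes, failed_nodes)
    )


# Hypothetical layout: shard id -> data nodes holding a copy
# (one primary plus one replica per shard, never on the same node).
layout = {
    0: {"data-1", "data-2"},
    1: {"data-2", "data-3"},
    2: {"data-3", "data-1"},
}

print(unavailable_shards(layout, {"data-1"}))            # []
print(unavailable_shards(layout, {"data-1", "data-2"}))  # [0]
```

With one replica per shard, losing a single data node leaves every shard reachable; losing both nodes that hold the copies of shard 0 does not.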
But if there are issues with data nodes, that would impact user
requests anyway, correct?
On Fri, Nov 15, 2013 at 10:39 AM, Alexander Reelsen alr@spinscale.de wrote:
Hey,
stability can be one of the reasons. As your data nodes can go out of
memory if you fire up certain queries, you do not want the cluster
stability impacted by this. A strategy to circumvent this problem is to
have a couple of data and master only nodes.
Do you mean physically separated in the LAN or WAN/VPN or not running data
node service on a master?
On Nov 15, 2013 1:17 PM, "Mohit Anchlia" mohitanchlia@gmail.com wrote:
Is there any advantage of keeping Master Node separate from Data Node? I
can't think of a possible reason why one would want to keep them separate.
We've found that keeping them separate (different servers/instances) evens
out CPU usage. Otherwise the master node would get starved of CPU and drag
down the whole cluster's performance (and it sometimes affected cluster
health). Not sure how this would work with multiple nodes with different
settings on the same box...
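
For reference, in the Elasticsearch releases current at the time of this thread, the split comes down to two per-node settings in elasticsearch.yml, `node.master` and `node.data` (newer releases configure roles differently, so check the docs for your version). A small Python sketch that prints the two variants; the role names and the helper are only illustrative:

```python
# node.master / node.data are the elasticsearch.yml keys; the role names
# and this helper are hypothetical, used only to show the two variants.
ROLES = {
    "master-only": {"node.master": True, "node.data": False},
    "data-only": {"node.master": False, "node.data": True},
}


def render_settings(role):
    """Return the elasticsearch.yml lines for the given role."""
    return "\n".join(
        f"{key}: {str(value).lower()}" for key, value in ROLES[role].items()
    )


for role in ROLES:
    print(f"# {role}")
    print(render_settings(role))
```

A master-only node is eligible to be elected master but holds no shards, so heavy queries on the data nodes do not compete with it for memory or CPU.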
Is there any advantage of keeping Master Node separate from Data Node? I
can't think of a possible reason why one would want to keep them separate.