On Fri, Jan 3, 2014 at 9:23 AM, wayne <wa...@revinate.com> wrote:
We have a fairly large ES cluster and are noticing that as the cluster has
grown, split-brain issues have become more frequent. Following a
suggestion from kimchy, we'd like to try adding 3 non-data, master-eligible
nodes and flagging all other nodes as non-master-eligible. Is there any
guidance on specs for non-data, master-eligible nodes? Would a fairly small
instance, like an m1.small or a micro, be sufficient to run such nodes? Are
there any other considerations we should take into account? BTW, we looked
into using the es-zk plugin, but apparently it has not been updated recently
and is not supported in 0.90.4 and higher. Thanks for any tips/pointers!
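For readers following along, here is a minimal sketch of the layout described in the question: three dedicated master-eligible nodes that hold no data, with every other node holding data but not eligible as master. It assumes the zen-discovery settings of the 0.90.x/1.x era (`node.master`, `node.data`, `discovery.zen.minimum_master_nodes`); the helper names are hypothetical and exist only for illustration. The quorum value is the usual strict majority of master-eligible nodes, which is what guards against split brain.

```python
def node_settings(master_eligible: bool, holds_data: bool) -> dict:
    """Role flags for one node, as they would appear in elasticsearch.yml."""
    return {"node.master": master_eligible, "node.data": holds_data}


def minimum_master_nodes(master_eligible_count: int) -> int:
    """Strict majority of the master-eligible nodes (the quorum)."""
    return master_eligible_count // 2 + 1


def to_yaml(settings: dict) -> str:
    """Render a flat settings dict as simple 'key: value' YAML lines."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        else:
            rendered = str(value)
        lines.append(f"{key}: {rendered}")
    return "\n".join(lines)


# Assumption: exactly 3 master-eligible nodes, as proposed in the question.
MASTER_ELIGIBLE_COUNT = 3
quorum = {
    "discovery.zen.minimum_master_nodes": minimum_master_nodes(MASTER_ELIGIBLE_COUNT)
}

# Dedicated masters: eligible for election, store no shards.
dedicated_master = {**node_settings(True, False), **quorum}
# Everything else: stores shards, never elected master.
data_only = {**node_settings(False, True), **quorum}

print("# dedicated master node")
print(to_yaml(dedicated_master))
print("# data node")
print(to_yaml(data_only))
```

With three master-eligible nodes the quorum works out to 2, so a single isolated master cannot elect itself. Check the setting names against the documentation for the version you actually run.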
On Friday, January 3, 2014 11:02:03 AM UTC-8, Ivan Brusic wrote:
I have never used non-data nodes, but in general they should be primarily
CPU-bound, since it is their responsibility to gather the various shard
responses from different nodes and return a unified response to the client.
It all comes down to the number of concurrent requests and the size of your
data. Since you are using AWS, it should be simple to spin up different
types of instances, since you do not have to worry about relocating data
locally. Test out different configs with your specific workload.
With AWS, the bigger the instance type, the less likely you are to have a
busy neighbor. Find an instance where you can fit everything in memory,
since IO performance on the small instance types tends to be poor.
--
Ivan
On 4 January 2014 06:14, wayne <wa...@revinate.com> wrote:
Are you referring to non-data, non-master-eligible nodes, i.e. client-only
nodes? I read that a non-data, master-eligible node is only responsible for
coordinating the cluster members. Do they participate in queries as well?
--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearc...@googlegroups.com.
Mark, which nodes do your clients connect to? The master-only nodes or the
data nodes (or both)? If the master-only nodes must aggregate result sets
from data nodes, they need a decent amount of RAM available.
Were you able to find any guidance on how to select the proper hardware for
these nodes? Should they be the same size as the data nodes (CPU + RAM)?
Smaller?
On Friday, January 3, 2014 1:37:31 PM UTC-8, Mark Walkom wrote:
Master-only nodes can deal with queries, which is how we do it.
Our data nodes are larger, with more disk, but don't take part in quorum.
Our end users (or clients if you want) that run the queries connect to the
masters.
We've found 4G heap/8G system is enough for what we do, with 2 CPU cores
(these are VMs). The good thing is it'll be easy for us to increase this if
needed.
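As a rough illustration of the sizing above (4G heap on an 8G VM), here is a small sketch of the common rule of thumb of giving the JVM about half of system RAM. The function name and the 31 GB cap are assumptions for illustration, not something stated in this thread; the cap reflects the usual advice to keep the heap under roughly 32 GB.

```python
def suggest_heap_gb(system_ram_gb: float, cap_gb: float = 31.0) -> float:
    """Rule-of-thumb heap size: half of system RAM, capped at cap_gb.

    Both the 50% ratio and the cap are assumptions; tune them
    against your own workload.
    """
    return min(system_ram_gb / 2, cap_gb)


# The 8G VM from the message above; ES_HEAP_SIZE is the environment
# variable used by the startup scripts of that era.
print(f"ES_HEAP_SIZE={suggest_heap_gb(8):g}g")
```

The remaining memory is left to the operating system, which matters less on a node that holds no data but still leaves headroom for the JVM itself.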
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/1b834f47-ca10-4894-be69-c627491c900a%40googlegroups.com.