Specs for non-data, master-eligible nodes?

revdev_2 · January 3, 2014, 5:23pm

we have a fairly large ES cluster and are noticing that as the cluster has
gotten larger, split brain issues have become more frequent. as per a
suggestion by kimchy, we'd like to try adding 3 non-data, master-eligible
nodes and flagging all other nodes as non-master-eligible. is there any
guidance on spec for non-data, master-eligible nodes? would a fairly small
instance, like a m1.small or micro, be sufficient to run such nodes? any
other considerations we should take into account? btw, we looked into using
the es-zk plugin, but apparently it has not been updated recently and is
not supported in 0.90.4 and higher. thanks for any tips/pointers!
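
For reference, a minimal sketch of the layout described above, assuming the 0.90-era setting names `node.master` and `node.data` in elasticsearch.yml. The role names and helper functions are made up for illustration and are not part of Elasticsearch.

```python
# Hypothetical helpers: the role names and functions are illustrative only.
# The setting names node.master / node.data are assumed to match the
# 0.90-era elasticsearch.yml options.

ROLE_SETTINGS = {
    # master-eligible, holds no shards
    "dedicated_master": {"node.master": True, "node.data": False},
    # holds shards, never eligible to become master
    "data_only": {"node.master": False, "node.data": True},
}


def node_settings(role):
    """Return a copy of the settings dict for the given role."""
    return dict(ROLE_SETTINGS[role])


def to_yaml_lines(settings):
    """Render settings as 'key: value' lines suitable for elasticsearch.yml."""
    return [f"{key}: {str(value).lower()}" for key, value in sorted(settings.items())]


if __name__ == "__main__":
    for role in ("dedicated_master", "data_only"):
        print(f"# {role}")
        print("\n".join(to_yaml_lines(node_settings(role))))
```

The 3 new nodes would get the `dedicated_master` settings and every existing node the `data_only` settings.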

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/738b154b-d180-4f65-84d1-e267e8bf8ab4%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Ivan · January 3, 2014, 7:02pm

I have never used non-data nodes, but in general they should primarily be
CPU-bound since it is their responsibility to gather the various shard
responses from different nodes and return a unified response to the client.
It all comes down to the number of concurrent requests and the size of your
data. Since you are using AWS, it should be simple to spin up different
types of instances, because non-data nodes hold no local data that would
need relocating. Test out different configs with your specific workload.

With AWS, the bigger the instance type, the less likely you are to have a
busy neighbor. Find an instance where you can fit everything in memory,
since IO performance on the small instance types tends to be poor.

--
Ivan

revdev_2 · January 3, 2014, 7:14pm

are you referring to non-data, non-master-eligible, e.g. client-only nodes?
i read that a non-data, master-eligible node is only responsible for
coordination of the cluster members. do they participate in queries as well?

warkolm · January 3, 2014, 9:37pm

Master-only nodes can also handle queries, which is how we do it.
Our data nodes are larger, with more disk, but don't take part in quorum.
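
On the quorum point: a rough sketch of the usual majority rule for master-eligible nodes, assuming the zen discovery setting `discovery.zen.minimum_master_nodes` is what enforces it. The function name is made up for illustration.

```python
# Hypothetical helper: computes a strict majority of master-eligible nodes.
# The result is the value you would assign to
# discovery.zen.minimum_master_nodes (assumed setting name, zen discovery).

def minimum_master_nodes(master_eligible):
    """Return the smallest count that is a strict majority of master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"{n} master-eligible node(s) -> quorum of {minimum_master_nodes(n)}")
```

With 3 dedicated master-eligible nodes this gives 2, so one of them can be lost without the remaining pair losing the ability to elect a master.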

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

revdev_2 · January 3, 2014, 9:45pm

Mark, which nodes do your clients connect to? The master-only nodes or the
data nodes? (or both?) If the master-only nodes must aggregate result sets
from data nodes, they need to have some decent amount of RAM available.
Were you able to find any guidance on how to select the proper hardware for
these nodes? should they be the same size as the data nodes (cpu + ram)?
smaller?

warkolm · January 3, 2014, 9:52pm

Our end users (or clients if you want) that run the queries connect to the
masters.

We've found a 4 GB heap on an 8 GB system is enough for what we do, with
2 CPU cores (these are VMs). The good thing is it'll be easy for us to
increase this if needed.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com
