Cluster / Configuration / Quorum


(Nicolas Blanc) #1

Hi all,

My problem today is not a simple one. I already have some big clusters
in production, but so far the number of nodes has been known and has not
changed much. From now on I plan to be more and more elastic: I won't know
the exact number of nodes at a given time, as it will depend on the load
and on some auto-scaling rules.

My whole platform is built with Chef, so I am able to determine the number of
nodes with a few tricks. But I have a lot of questions:

When I add a new master node, the minimum_master_nodes value is
recalculated accordingly. Is this value propagated to the whole cluster
when the new master node comes up? Chef updates the configuration of all ES
nodes, but does not restart those that are already running. Do I need to
change this value via a call to the API
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-update-settings.html)?
When I remove a node, I think I have no choice but to call the API to
reduce minimum_master_nodes. Am I right?

Another question: is it a good idea to use unicast and define all master
nodes in the configuration files, or should I just use multicast?

And finally, I want to have a cluster spanning 2 zones: one is my current DC,
and the other is an EC2 VPC. What's the best way to connect the 2 zones?
Unicast + multicast, or EC2 + unicast? Do I need to list every master node
of the whole cluster in each configuration file?

Thanks in advance to anyone who tries to answer.

--
Nicolas Blanc.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(David Pilato) #2

Hey Nicolas,

What I would probably do in your case is to split node roles:

3 nodes with data: false, master: true
X data nodes: data: true, master: false
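
A minimal sketch of that split, written as a small Python helper that renders the per-role settings. `node.master` and `node.data` are the elasticsearch.yml keys behind the shorthand above; the helper names and the "master" / "data" role labels are assumptions made for illustration only.

```python
def role_settings(role):
    """Return the elasticsearch.yml role settings for one node.

    `role` is a hypothetical label: "master" for a dedicated
    master-eligible node, "data" for a data-only node.
    """
    if role == "master":
        return {"node.master": True, "node.data": False}
    if role == "data":
        return {"node.master": False, "node.data": True}
    raise ValueError("unknown role: %r" % role)


def render_yaml(settings):
    """Render flat settings as elasticsearch.yml lines (lowercase booleans)."""
    return "\n".join(
        "%s: %s" % (key, str(value).lower())
        for key, value in sorted(settings.items())
    )


if __name__ == "__main__":
    print(render_yaml(role_settings("master")))
```

A Chef template could emit the same two lines per node; only the three dedicated masters would get the "master" variant.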

Note that master nodes do not require "big machines": 512 MB RAM, spinning disks, 1 core...

No need to worry about minimum_master_nodes then. Just set it to 2.
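
Why 2: with three master-eligible nodes a majority is (3 // 2) + 1 = 2, and because the data nodes are not master-eligible, adding or removing them never changes that number. A small sketch of the arithmetic, assuming the setting in question is `discovery.zen.minimum_master_nodes` and that it would be applied through the cluster update settings API linked in the question. The helper names are made up for illustration, and nothing is sent over the network here.

```python
import json


def minimum_master_nodes(master_eligible):
    """Majority of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def settings_body(master_eligible, persistent=True):
    """Build a JSON body for PUT /_cluster/settings (not sent here)."""
    scope = "persistent" if persistent else "transient"
    return json.dumps(
        {scope: {"discovery.zen.minimum_master_nodes":
                 minimum_master_nodes(master_eligible)}},
        sort_keys=True,
    )


if __name__ == "__main__":
    # Three dedicated masters, any number of data nodes.
    print(settings_body(3))
```

With a fixed set of three masters the body only needs to be sent once (or the value set in the config file); auto-scaling the data nodes does not require touching it.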

What do you think?

About having 2 data centers: I would not recommend connecting your internal nodes directly to Amazon unless you know that you have a very low latency network.
I would set up 2 clusters and push docs from the service layer to both clusters.
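
A rough sketch of what "push from the service layer to both clusters" could look like. Everything here is an assumption for illustration: `index_everywhere` is a made-up name, and each writer is a plain callable standing in for whatever client call the service would use against one cluster.

```python
def index_everywhere(doc, writers):
    """Send `doc` to every cluster; return the names of clusters that failed.

    `writers` maps a cluster name to a callable that indexes one document
    there. In a real service each callable would wrap a client call for
    that cluster; here they are placeholders.
    """
    failed = []
    for name, write in writers.items():
        try:
            write(doc)
        except Exception:
            # Keep going so one unreachable cluster does not block the other;
            # the caller decides how to retry or queue the failed writes.
            failed.append(name)
    return failed


if __name__ == "__main__":
    dc_docs, ec2_docs = [], []
    failed = index_everywhere(
        {"id": 1, "title": "hello"},
        {"dc": dc_docs.append, "ec2": ec2_docs.append},
    )
    print(failed)
```

The point of returning the failed cluster names is that the two clusters can drift apart if one write fails, so the service needs some way to replay those documents later.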

HTH

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

In reply to Nicolas Blanc (nicolas.blanc@blablacar.com), 21 Oct 2013 at 17:42.



(Amit Soni) #3

And if I remember correctly, version 1.0 has a provision for taking
snapshots, which can address disaster recovery.

David - just wondering if you have plans to support a mechanism that would
enable data replication across two data centers (more like a near-real-time
mirroring feature)?

-Amit.

In reply to David Pilato (david@pilato.fr), Mon, Oct 21, 2013 at 1:59 PM.



(David Pilato) #4

Yes but not in the short term.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

In reply to Amit Soni (amitsoni29@gmail.com), 22 Oct 2013 at 04:30.


