HAProxy between clients and cluster

Chad_Kouse · May 31, 2013, 4:18am

We are trying to use HAProxy to ensure that a failed node stops receiving
requests from our client. One problem we have run into with this approach
is the case where a node incorrectly detects a fault and takes over as the
master.

When this happens HAProxy thinks all nodes are up, but updates are randomly
going to one master or the other -- same with searches.

Is there an API we should use to detect when more than one node thinks it's
a master and try to determine what node should be removed from service?

Is there a better approach here than using HAProxy?

Thanks,
--chad

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Ivan · May 31, 2013, 4:47pm

More than one master means there is more than one cluster, which obviously
is not a good state to be in. Or are you referring to primary/replica
shards?

If you do have more than one cluster, then obviously no single API call
will give you the results you want, since the other cluster will have a
different answer. AFAIK, you would need to use the cluster state API on
each node to determine if there is more than one master. However, I cannot
think of a way to determine the odd man out (using the API). A
master_since field would be helpful.
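
A minimal sketch of that per-node check, assuming each node answers GET /_cluster/state with JSON that contains a "master_node" id. The node URLs and helper names are hypothetical, and nothing here picks the odd man out; it only reports disagreement.

```python
import json
import urllib.request
from collections import defaultdict


def fetch_master_id(base_url, timeout=5.0):
    """Ask one node which node id it currently considers the master."""
    url = base_url + "/_cluster/state"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        state = json.load(resp)
    # Assumption: the cluster state JSON carries a top-level "master_node".
    return state.get("master_node")


def group_by_master(reported):
    """Group node URLs by the master id each one reported.

    `reported` maps node URL -> master node id (None if unknown).
    """
    groups = defaultdict(list)
    for node_url, master_id in reported.items():
        groups[master_id].append(node_url)
    return dict(groups)


def is_split(reported):
    """True when the polled nodes disagree about who the master is."""
    return len(group_by_master(reported)) > 1


# Offline demonstration with made-up answers from two nodes.
sample = {"http://es1:9200": "nodeA", "http://es2:9200": "nodeB"}
print(is_split(sample))  # True: each node names a different master
```

In real use, `reported` would be built with something like `{url: fetch_master_id(url) for url in node_urls}`. With only two nodes the groups are the same size, so the result says that a split exists but not which side to drop.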

Cheers,

Ivan

Chad_Kouse · May 31, 2013, 9:15pm

Yeah, it happened twice last night: we brought a node up and it joined the cluster. Later it had dropped out of the cluster and become the master of its own single-node cluster (with the same cluster name).

-- chad

Nick_Zadrozny · May 31, 2013, 9:23pm

How many nodes are you running, how many are master-eligible, and what's
your setting for discovery.zen.minimum_master_nodes?

Given a reasonable setting for minimum master nodes, any single node which
fails to join the rest of the cluster should refuse to form its own. From
there, health checks to the cluster root URL will return a 503 error,
causing HAProxy to take it out of the rotation.
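
As a rough illustration of that health check, here is a sketch of the decision a load balancer would make, written in Python rather than HAProxy configuration. It assumes the behaviour described above (the root URL answers 200 when the node is part of a cluster with a master and 503 when it is not). The function name and the injectable `opener` parameter are hypothetical and exist only to make the sketch testable.

```python
import urllib.error
import urllib.request


def node_in_rotation(base_url, opener=urllib.request.urlopen, timeout=2.0):
    """Return True only if the node answers its root URL with HTTP 200."""
    try:
        with opener(base_url + "/", timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        # Any HTTP error status, including the 503 described above.
        return False
    except OSError:
        # Connection refused, timeout, DNS failure and similar.
        return False
```

A node that is unreachable and a node that reports 503 are treated the same way here: both come out of rotation until a later probe succeeds.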

--
Nick Zadrozny

http://websolr.com • http://bonsai.io
Hosted full-text search, with Solr and Elasticsearch.

Let's talk in real time: http://meetme.so/nz

Chad_Kouse · May 31, 2013, 9:25pm

just 2 nodes with all the defaults for zen discovery (0.20.5)

-- chad

jprante · May 31, 2013, 9:41pm

You need at least 3 nodes to avoid split brains (minimum_master_nodes
only works well with an odd number of master-eligible nodes).
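
The arithmetic behind this can be sketched as follows, assuming minimum_master_nodes is set to a simple majority of the master-eligible nodes. The function names are illustrative only.

```python
def quorum(master_eligible):
    """Smallest majority of the master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible):
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible - quorum(master_eligible)


for n in (2, 3, 4, 5):
    print(n, quorum(n), tolerated_failures(n))
# 2 2 0
# 3 2 1
# 4 3 1
# 5 3 2
```

With 2 nodes a majority is 2, so losing either node leaves no master at all. With 3 nodes a majority is still 2 and one node can fail. Going from 3 to 4 nodes does not raise the number of tolerated failures, which is why odd counts are preferred.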

Jörg

Chad_Kouse · May 31, 2013, 9:42pm

So I'd set discovery.zen.minimum_master_nodes to 2 and bring up a 3rd node?

-- chad

jprante · May 31, 2013, 9:43pm

Yes

Jörg

Ivan · May 31, 2013, 9:44pm

I would look into why the nodes lost contact with each other. Was one
server overloaded? Maybe increasing one of the timeouts is all you need.

--
Ivan

Chad_Kouse · May 31, 2013, 9:47pm

Yeah, it could have been too busy to respond to pings, I suppose. I will increase the timeouts beyond the defaults (30s, I believe).
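
To get a feel for what raising the timeout changes, here is a deliberately simplified model of ping-based fault detection. It assumes the zen fault-detection settings ping_interval, ping_timeout and ping_retries with defaults of 1s, 30s and 3, which should be checked against the version in use. It is not the actual detection algorithm; the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultDetection:
    # Assumed defaults; verify against the documentation for your version.
    ping_interval_s: float = 1.0
    ping_timeout_s: float = 30.0
    ping_retries: int = 3

    def worst_case_detection_s(self):
        """Rough upper bound: every retry runs to its full timeout."""
        return self.ping_retries * self.ping_timeout_s


print(FaultDetection().worst_case_detection_s())                     # 90.0
print(FaultDetection(ping_timeout_s=60.0).worst_case_detection_s())  # 180.0
```

Under this model a node already has to stay silent for about a minute and a half before it is dropped, so a longer timeout mostly helps if the pauses (heavy load, long garbage collections) are longer than that. It also delays detection of a node that has really failed.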

-- chad
