Multi-master cluster


(Adam Estrada) #1

I am wondering what the default behavior (see title) in my cluster is
supposed to be. I am collecting tweets in a cluster called Twitter. Right
now there are 2 nodes, server1 and server2. If one of these nodes fails, I
would like for the other one to pick up where it left off. Is that the
intended behavior or is there some other mechanism I am missing? I should
note that the index I created has 6 shards and 2 replicas.
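
A side note on those index settings, as a rough sketch: Elasticsearch does not place two copies of the same shard on one node, so with only 2 nodes one replica of every shard stays unassigned. The function name `shard_summary` is made up for illustration; the numbers come from the post above.

```shell
#!/bin/bash
# Shard-copy arithmetic for an index, assuming a node holds at most one
# copy (primary or replica) of any given shard.
# Usage: shard_summary <shards> <replicas> <nodes>
shard_summary() {
    local shards="$1" replicas="$2" nodes="$3"

    # Each shard exists as one primary plus its replicas.
    local copies_per_shard=$(( 1 + replicas ))
    local total=$(( shards * copies_per_shard ))

    # Copies that can actually be assigned per shard are capped by node count.
    local per_shard=$(( copies_per_shard < nodes ? copies_per_shard : nodes ))
    local assigned=$(( shards * per_shard ))

    echo "total=${total} assigned=${assigned} unassigned=$(( total - assigned ))"
}

# 6 shards, 2 replicas, 2 nodes (the setup described above).
shard_summary 6 2 2   # prints: total=18 assigned=12 unassigned=6
```

If one of the two nodes goes down, the survivor still holds one copy of each shard, so the indexed data stays available; that is separate from whether the river itself resumes.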

Thanks,
Adam


(Adam Estrada) #2

I guess what I am looking for is the best way to prevent failure on my
cluster. I am thinking that I can set up 2 masters and then some data nodes
but I really need to ensure that if data has stopped being collected by one
node, the other one will pick it up and run with it. I noticed yesterday
that both master nodes can't have a river on them. Or maybe it's that both
masters can't have a river connected to an index with the same name. Anyone
have thoughts on this?

A


(Ivan Brusic) #3

Rivers are run as a single instance per cluster. That is the main
benefit of utilizing a river: the indexing is done at the
cluster-level, so it can continue even with partial node failures.
That said, I have never tested how well a river responds should the
node it is running on go down.

--
Ivan


(Adam Estrada) #4

Ivan,

Thanks for the feedback. It looks like when the twitter river stops working
for any reason, the other instance(s) will not pick it up either. This is
what you mentioned. So, how do you recommend making sure the rivers never
stop running?

Adam


(David Pilato) #5

I saw that issue some months ago, but Shay and I did not find a way to solve it.
It's related to the Twitter river itself, not to rivers in general.

The Twitter river fails but does not restart itself.

David :wink:
Twitter : @dadoonet / @elasticsearchfr


(Adam Estrada) #6

Ahh... is there any plan for a patch on that? If you point me in the right direction, I can take a stab at it.

A


(David Pilato) #7

There's an issue about it: https://github.com/elasticsearch/elasticsearch-river-twitter/issues/14
And here is the original thread: http://elasticsearch-users.115913.n3.nabble.com/Twitter-River-stopping-after-an-arror-td3849653.html

What I did was create a cron job (a shell script) that looks into the logs and restarts the node on each error. As I had 2 nodes, the Twitter river restarted on node 2.

That was a small workaround, but not the best way to solve it!

If you need it, I can share my shell script.

David


(Adam Estrada) #8

We are investigating how to fix the problem in the river code. We'll share
the fixes as they come in :wink: I would still be interested in seeing your
code too though.

Adam


(David Pilato) #9

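Here is the script:

#!/bin/bash

# Restart Elasticsearch when the Twitter river has logged an exception.
file=/usr/local/elasticsearch/elasticsearch-0.19.8/logs/es-twitter.log
if grep -q "TwitterException" "$file"
then
  # Record the restart with a timestamp.
  echo "$(date +'%Y-%m-%d %H:%M:%S'): Restarting ES" >> /home/ec2-user/es.log
  /etc/rc.d/init.d/elasticsearch stop
  # Remove the log so the same exception does not trigger another restart.
  rm "$file"
  /etc/rc.d/init.d/elasticsearch start
fi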
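
For anyone adapting this, here is a minimal sketch of the same watchdog idea split into small functions. It makes two changes that are my assumptions, not part of the script above: it tolerates a missing log file, and it moves the log aside with a timestamp instead of deleting it, so the stack trace is kept. The function names and the `--run` flag are made up for illustration; the paths are the ones from the script above.

```shell
#!/bin/bash
# Log watchdog sketch. Defaults are taken from the script above and can be
# overridden through the environment.
LOG_FILE="${LOG_FILE:-/usr/local/elasticsearch/elasticsearch-0.19.8/logs/es-twitter.log}"
PATTERN="${PATTERN:-TwitterException}"

# Succeeds (exit status 0) only if the log exists and contains the pattern.
river_failed() {
    local log="$1" pattern="$2"
    [ -f "$log" ] && grep -q -- "$pattern" "$log"
}

# Move the log aside under a timestamped name and print the new path.
archive_log() {
    local log="$1"
    local archived="${log}.$(date +'%Y%m%d-%H%M%S')"
    mv -- "$log" "$archived" && echo "$archived"
}

# Restart the node only when a failure has been logged.
check_and_restart() {
    if river_failed "$LOG_FILE" "$PATTERN"; then
        /etc/rc.d/init.d/elasticsearch stop
        archive_log "$LOG_FILE" > /dev/null
        /etc/rc.d/init.d/elasticsearch start
    fi
}

# Do nothing unless explicitly asked, so the file can be sourced safely.
if [ "${1:-}" = "--run" ]; then
    check_and_restart
fi
```

From cron it could be scheduled with an entry such as `*/5 * * * * /home/ec2-user/river-watchdog.sh --run`, where the script path and the five-minute interval are placeholders.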

David

--
David Pilato
http://www.scrutmydocs.org/
http://dev.david.pilato.fr/
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


(Adam Estrada) #10

Thanks! Although I thought that you were removing the river then adding it
back again. Restarting the node can get expensive, right?
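
For comparison, removing and re-adding a river goes through the REST API: a river is created by putting its `_meta` document under the `_river` index and stopped by deleting its type there. Below is a rough sketch of that approach. Whether it actually revives a Twitter river that has hit this error is not confirmed anywhere in this thread. `ES_URL`, `RIVER_NAME`, `META_FILE`, the function names, and the `--run` flag are placeholders.

```shell
#!/bin/bash
# Sketch: re-register a river instead of restarting the whole node.
# All three settings below are placeholders to adjust.
ES_URL="${ES_URL:-http://localhost:9200}"
RIVER_NAME="${RIVER_NAME:-my_twitter_river}"
# JSON file holding the river definition that was used to create it.
META_FILE="${META_FILE:-./twitter-river-meta.json}"

# Base URL of the river inside the _river index.
river_url() {
    echo "${ES_URL}/_river/${RIVER_NAME}"
}

recreate_river() {
    # Deleting the river's type under _river stops the river.
    curl -s -XDELETE "$(river_url)/"
    # Putting _meta back registers the river so the cluster starts it again.
    curl -s -XPUT "$(river_url)/_meta" -d @"$META_FILE"
}

# Do nothing unless explicitly asked, so the file can be sourced safely.
if [ "${1:-}" = "--run" ]; then
    recreate_river
fi
```

This avoids a node restart, but the river's original definition has to be kept in a file so it can be put back.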

A

