Recovery in case of Gateway set to none


(Sebastian Gavarini) #1

Hello,

I am interested in deploying ES, but I have a couple of questions
about the Gateway module that I couldn't answer from the current
documentation. First, a bit of background: the site has a persistent
store in a traditional RDBMS, and the idea is to also send the
information to ES for searching.

Since all the information needed for batch recovery is already in a DB,
and given the reasonable stability of the data center, I am considering
not replicating the information yet again in a Gateway (DB + ES
Lucene + ES Gateway). For example, if the site goes totally down and
then comes back, I can live with some inconsistent indices (meaning
some values are in one Lucene replica but not in another, not a
corrupted index of course) as long as mostly everything is OK (if you
hit one server you'll see a couple more results than on the next one,
but that's nothing serious in my case). What would be a problem is
recovery taking more than a couple of minutes in the name of
consistency, or the whole cluster being blocked waiting for some
shard/index on a server that is still missing for whatever reason.
Later, without the pressure of live site downtime, I would run a full
batch reindex, using aliases so it won't disrupt traffic.
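
The alias-based reindex described above could be sketched roughly as follows. This is only an illustration: it builds the request body for the `_aliases` endpoint and sends nothing, and the alias and index names (`site`, `site_v1`, `site_v2`) are hypothetical.

```python
import json


def alias_swap_actions(alias, old_index, new_index):
    """Build an _aliases request body that repoints `alias`
    from `old_index` to `new_index`."""
    return {
        "actions": [
            # Stop serving searches from the stale index...
            {"remove": {"index": old_index, "alias": alias}},
            # ...and start serving them from the freshly built one.
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical names: the live alias and the two index generations.
    body = alias_swap_actions("site", "site_v1", "site_v2")
    print(json.dumps(body, indent=2))
```

The idea is to index the full batch into the new index first, and only then POST this body to `/_aliases`, so that searches against the alias switch over in a single request.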

The point is that, instead of paying the everyday price of
replication, I'd pay with a bit of inconsistency plus a full batch
reindex only in the case of a full cluster failure.

So the question is: if I disable the Gateway and a full cluster
failure happens, is it possible for ES to come back with its local
index information (and probably the cluster metadata too) without
reloading everything from snapshots + translog? If yes, which
configuration options enable this?

This is getting too long, I know, sorry for that.

Thanks a lot for all the hard work,
Sebastian.


(Shay Banon) #2

Sounds like you are looking for the new local gateway added in 0.11. I
blogged about it here:
http://www.elasticsearch.com/blog/2010/09/27/zero_conf_persistency.html
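
For reference, a minimal sketch of what enabling the local gateway might look like in `elasticsearch.yml`. The setting names and values below are assumptions based on the 0.x-era gateway documentation, not something stated in this thread, so verify them against the version you run.

```yaml
# Assumed settings, not taken from this thread: check your version's docs.
gateway.type: local              # recover cluster state and indices from each node's local storage

# Optional recovery gating (values are illustrative only):
gateway.recover_after_nodes: 2   # do not start recovery until at least this many nodes have joined
gateway.recover_after_time: 2m   # once that threshold is met, wait this long before recovering
gateway.expected_nodes: 3        # start recovery immediately once this many nodes are present
```

The gating settings are relevant to the concern about the cluster waiting on a missing server: a time limit lets recovery proceed without every node being present.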

On Wed, Sep 29, 2010 at 8:04 PM, Sebastian sgavarini@gmail.com wrote:



(Sebastian Gavarini) #3

Hi Shay,

Thanks for the quick answer.

Yes, you are right, I've seen it and gave it a try. I saw that it
generates the Lucene index files and a translog dir, but I didn't see a
snapshot anywhere. I added some documents, did a refresh, tried the
snapshot API, and nothing.

I read the documentation but couldn't find an answer to this. I assume
it doesn't need a separate snapshot, right? It just uses Lucene and a
translog.

Cheers,
Sebastian

On Sep 30, 2:06 am, Shay Banon shay.ba...@elasticsearch.com wrote:



(Shay Banon) #4

Yes, and recovery is done from each node's local data.

On Thu, Sep 30, 2010 at 7:18 AM, Sebastian sgavarini@gmail.com wrote:


