Is it possible to copy a gateway snapshot between clusters?


(Grant Rodgers) #1

We would like to copy the gateway snapshot from a production cluster
to a development cluster. Since our production gateway is on s3,
copying indices to development could take several minutes, during
which the production cluster would still be snapshotting. Is this
safe? If not, could we disable snapshotting temporarily while the copy
is happening?

Also, is it possible to copy an s3 gateway to a filesystem gateway?

We are running ES 0.9.

Thanks,
Grant


(Shay Banon) #2

Yes, you can safely copy the gateway data, either to another s3 gateway or
to a filesystem gateway, even while it's snapshotting.
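
For the filesystem case, the copy could look something like the minimal sketch below. It is only an illustration: `copy_gateway` and its arguments are hypothetical names, it assumes both gateways are plain directories (an s3 source would need an S3 client, which is not shown), and it assumes that a file which disappears mid-copy was cleaned up by the running cluster and can be skipped.

```python
import os
import shutil


def copy_gateway(src_root, dst_root):
    """Copy every file under src_root to dst_root, preserving relative paths.

    ASSUMPTION: a file that vanishes between listing and copying was removed
    by the cluster that is still snapshotting, so it is skipped, not fatal.
    Returns (copied, skipped) lists of paths relative to src_root.
    """
    copied, skipped = [], []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        # Recreate the directory layout on the destination side.
        os.makedirs(os.path.normpath(os.path.join(dst_root, rel_dir)), exist_ok=True)
        for name in filenames:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            try:
                shutil.copy2(os.path.join(src_root, rel), os.path.join(dst_root, rel))
                copied.append(rel)
            except FileNotFoundError:
                # The source file was deleted while the copy was running.
                skipped.append(rel)
    return copied, skipped
```

Calling `copy_gateway("/path/to/prod-gateway", "/path/to/dev-gateway")` (both paths are placeholders) returns the lists of copied and skipped files, so anything that was skipped can be reviewed before starting the development cluster.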

-shay.banon


(Grant Rodgers) #3

Great! Is it also possible to copy only certain indices? We would like
to be able to copy only the metadata directory and index directories
for the indices we care about.

I guess the question is, can the gateway recovery handle missing
indices in the gateway snapshot?

Thanks,
Grant


(Shay Banon) #4

A simpler solution is to take the metadata-XXX file (it's a json file) and
remove the indices you don't want to use. It can also be hacked, for example,
to change the number of replicas before starting the cluster again (though
there will be an API for that).
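
As a rough sketch of that edit, something like the following could be used. The layout of the metadata file is an assumption here, not taken from this thread: the code assumes the indices sit under `"meta-data"` / `"indices"` keyed by index name, and that the replica count is the string setting `index.number_of_replicas`. `prune_metadata` is a hypothetical helper name. Check the actual metadata-XXX file before relying on any of this.

```python
import json


def prune_metadata(metadata, keep, replicas=None):
    """Return a copy of the metadata that keeps only the indices in `keep`.

    ASSUMPTION: indices live under metadata["meta-data"]["indices"], keyed by
    index name, and the replica count is stored as the string setting
    "index.number_of_replicas". Neither is confirmed by the thread.
    """
    result = json.loads(json.dumps(metadata))  # deep copy of plain JSON data
    indices = result["meta-data"]["indices"]
    for name in list(indices):
        if name not in keep:
            # Drop every index we do not want recovered on the other cluster.
            del indices[name]
        elif replicas is not None:
            settings = indices[name].setdefault("settings", {})
            settings["index.number_of_replicas"] = str(replicas)
    return result
```

The edited structure would then be written back with `json.dump` over the copy of the metadata file in the development gateway, never over the production one.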

-shay.banon


(Berkay Mollamustafaoglu-2) #5

Nice !!!

Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype


(Lukáš Vlček) #6

Shay,

Do you think you could elaborate on this? I am surprised this is
possible. My naive understanding is that while snapshotting is going on,
some files in the gateway are being changed. How is it then possible that
the copy is consistent?

Regards,
Lukas


(Shay Banon) #7

It basically boils down to how Lucene works when storing the index, and the
additional md5 checksum files elasticsearch produces for them. Basically, a
new index "version" is written to the gateway while the previous one still
exists, and only when it's done being written to the gateway is the "old" one
removed. The transaction log is an append-only log, keyed by the index
version, so you just copy it over, and however many operations managed to get
into it are the ones you will get when you recover.
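
To make the "write the new version first, remove the old one last" idea concrete, here is a toy in-memory model. It is a sketch, not the elasticsearch gateway code: `ToyGateway`, the `v<N>/` file naming and the `commit` marker are all invented for illustration, and the md5 checksums and the transaction log are left out.

```python
class ToyGateway:
    """In-memory stand-in for a gateway; NOT the elasticsearch implementation."""

    def __init__(self):
        self.files = {}  # file name -> content

    def snapshot(self, version, data_files):
        """Write `version` step by step, yielding after every single step."""
        for name, content in data_files.items():
            self.files[f"v{version}/{name}"] = content
            yield
        # The commit marker is written last, so it implies a complete version.
        self.files[f"v{version}/commit"] = sorted(data_files)
        yield
        # Only now are the files of every older version removed.
        for name in [n for n in self.files if not n.startswith(f"v{version}/")]:
            del self.files[name]
        yield


def complete_versions(files):
    """Versions whose commit marker and all files it lists are present."""
    done = []
    for name, listed in files.items():
        if name.endswith("/commit"):
            prefix = name[: -len("commit")]  # e.g. "v2/"
            if all(prefix + f in files for f in listed):
                done.append(int(prefix[1:-1]))
    return sorted(done)
```

Taking a copy of `files` after any step of `snapshot` always yields at least one complete version in this model, which is the property that makes copying during a snapshot safe.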

-shay.banon


(Grant Rodgers) #8

This is yet another example of the consistently elegant and impressive
architecture of elasticsearch. Keep up the good work, Shay. I for one
am amazed by the rapid pace of development!


(Shay Banon) #9

thanks!

