Since S3 Gateway has been officially deprecated, how should one maintain
cluster persistence while running on Amazon cloud?
I can think of the following (a rough sketch in code follows the list):
- Once in a while, issue a flush, then disable_flush on a node.
- rsync the index data to a backup folder.
- Sync the backup folder to S3 in the background, into a folder named after the node.
- Do that for every node.
That way the whole cluster is backed up to S3 as of the most recent snapshot point.
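Roughly, per node, something like this Python sketch; the paths, bucket name, the pre-1.0 index.translog.disable_flush setting, and the aws CLI being on the PATH are all assumptions on my side:

import subprocess
import requests

ES = "http://localhost:9200"
DATA_DIR = "/var/lib/elasticsearch/data"   # assumed data path
BACKUP_DIR = "/mnt/backup/elasticsearch"   # local staging folder
S3_TARGET = "s3://my-es-backups/node-1/"   # hypothetical bucket, one prefix per node

def toggle_flush(disabled):
    # Pause or resume automatic translog flushing on all indices
    # (the old "index.translog.disable_flush" setting mentioned above).
    requests.put(ES + "/_settings",
                 json={"index": {"translog": {"disable_flush": disabled}}}).raise_for_status()

# Force a flush so the on-disk segments are current, then pause further flushes.
requests.post(ES + "/_flush").raise_for_status()
toggle_flush(True)
try:
    # Copy the index files into the staging folder while flushing is paused.
    subprocess.run(["rsync", "-a", "--delete", DATA_DIR + "/", BACKUP_DIR + "/"],
                   check=True)
finally:
    toggle_flush(False)

# Push the staged copy to S3 in the background; repeat this script on every node.
subprocess.Popen(["aws", "s3", "sync", "--delete", BACKUP_DIR, S3_TARGET])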
But if I have 20 nodes in the cluster, restoring it from scratch will be a
lot of manual work.
Another way I can see:
- Run on EBS.
- Periodically flush, then disable_flush on a node.
- sync the filesystem.
- Create an EBS snapshot.
- enable_flush again (see the sketch right after this list).
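In code that per-node cycle might look like this; again only a sketch, and the volume ID, region and the disable_flush setting are assumptions:

import subprocess
import boto3
import requests

ES = "http://localhost:9200"
VOLUME_ID = "vol-0123456789abcdef0"   # hypothetical: the EBS volume under the data path

ec2 = boto3.client("ec2", region_name="us-east-1")

def toggle_flush(disabled):
    # Pause or resume automatic translog flushing (pre-1.0 disable_flush setting).
    requests.put(ES + "/_settings",
                 json={"index": {"translog": {"disable_flush": disabled}}}).raise_for_status()

requests.post(ES + "/_flush").raise_for_status()   # flush once
toggle_flush(True)                                  # then pause flushing
try:
    subprocess.run(["sync"], check=True)            # flush OS buffers down to the EBS volume
    snap = ec2.create_snapshot(VolumeId=VOLUME_ID,
                               Description="es-node-1 periodic backup")
    print("started snapshot", snap["SnapshotId"])
finally:
    toggle_flush(False)                             # re-enable flushing right away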
But still:
- Need to take care of pruning older snapshots.
- Restoring still looks like a manual pain.
So what is the advised practice of running multi-node cluster on AWS with
ability to recover from cluster sudden death?
Can I still go with S3 gateway if I'll take particular precautions that
someone can outline?
Since S3 Gateway has been officially deprecated, how should one maintain
cluster persistence while running on Amazon cloud?
EBS :), preferably with Provisioned IOPS.
Another way I can see:
I believe you don't have to disable flush with EBS snapshots, since they
should be point-in-time snapshots. Some documentation is at [1].
So what is the advised practice of running multi-node cluster on AWS with
ability to recover from cluster sudden death?
Create EBS snapshots at intervals which make economic sense to you
(hourly, daily, weekly?). To test your disaster recovery plan, just
hard-terminate your nodes. Then create new EBS volumes from your
snapshots, launch new EC2 instances, attach those volumes, and voilà, the
cluster should be fine.
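The restore side can be scripted as well. A rough boto3 sketch; all IDs and the device name below are made up, and it assumes the replacement instances are already running:

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Map each replacement instance to the snapshot taken from the node it replaces.
restore_plan = {
    "i-0aaa111122223333a": "snap-0123456789abcdef0",
    "i-0bbb444455556666b": "snap-0fedcba9876543210",
}

for instance_id, snapshot_id in restore_plan.items():
    # The new volume has to live in the same availability zone as its instance.
    reservation = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"][0]
    az = reservation["Instances"][0]["Placement"]["AvailabilityZone"]

    # Create a fresh volume from the snapshot and wait until it is usable.
    volume = ec2.create_volume(SnapshotId=snapshot_id, AvailabilityZone=az)
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])

    # Attach it where the Elasticsearch data path expects it, then start the node.
    ec2.attach_volume(VolumeId=volume["VolumeId"], InstanceId=instance_id,
                      Device="/dev/sdf")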
It's a good idea to have a process like this automated, of course. See the
Elasticsearch Chef tutorial [2] for a detailed walkthrough of one
possibility.
Need to take care of pruning older snapshots
With a good library such as Fog for Ruby [3], it's really easy to have it
all automated and nifty. There are many scripts on the internet for
inspiration; a web search for "automate ebs snapshots fog" turns up plenty.
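If Ruby isn't your thing, the same pruning idea works with boto3 in Python. A sketch; the description filter and the 30-day retention are just examples:

import datetime
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
cutoff = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=30)

# Look only at snapshots this account owns and that the backup job created.
snapshots = ec2.describe_snapshots(
    OwnerIds=["self"],
    Filters=[{"Name": "description", "Values": ["es-node-* periodic backup"]}],
)["Snapshots"]

for snap in snapshots:
    if snap["StartTime"] < cutoff:
        ec2.delete_snapshot(SnapshotId=snap["SnapshotId"])
        print("deleted", snap["SnapshotId"], "created", snap["StartTime"])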
Can I still go with S3 gateway if I'll take particular precautions that
someone can outline?
The official Elasticsearch advice, and my own advice based on personal
experience, is: don't do that. There are adventurous people who enjoy
some thrill, though.
Karel, thank you for your definitive points.
The path is clear now.
Zaar
Have you tried the "create EBS snapshot whenever, without any ES flushing
or pausing, and then recover ES from those EBS snapshots" approach yourself,
or has anyone you know and really trust tried it?
I believe you don't have to disable flush with EBS snapshots, since they should be point-in-time snapshots. Some documentation is at [1].
Have you tried the "create EBS snapshot whenever, without any ES flushing or pausing, and then recover ES from those EBS snapshots" approach yourself, or has anyone you know and really trust tried it?
As said, I "believe" EBS snapshots are point-in-time. I have, in fact, successfully recovered from an EBS snapshot without previously disabling flush on a running system. But since it is hard to simulate all the variables and possibilities in play, disabling flush seems like a sane precaution to me.