So, with the deprecation of the S3 gateway, what is the current best approach to cluster persistence?

Hi Karel,

Have you tried the "create EBS snapshot whenever, without any ES flushing
or pausing, and then recover ES from those EBS snapshots" approach yourself,
or has anyone you know and reeeeally trust? :)

Thanks,
Otis

Solr & Elasticsearch Support

On Mon, Jan 21, 2013 at 3:22 AM, Karel Minařík <karel.minarik@gmail.com> wrote:

Since S3 Gateway has been officially deprecated, how should one maintain
cluster persistence while running on Amazon cloud?

EBS :), preferably with Provisioned IOPS.

Another way I can see:

I believe you don't have to disable flush with EBS snapshots, since they
should be point-in-time snapshots. Some documentation is at [1].

So what is the advised practice for running a multi-node cluster on AWS with
the ability to recover from sudden cluster death?

Create EBS snapshots at intervals which make economic sense to you
(hour, day, week?). To test your disaster recovery plan, just
hard-terminate your nodes. Then create new EBS volumes from your
snapshots, launch new EC2 instances, attach those volumes, and voilà, the
cluster should be fine.

It's a good idea to have a process like this automated, of course. See the
Elasticsearch Chef tutorial [2] for a detailed walkthrough of one
possibility.
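As a rough illustration of the snapshot and recovery steps above, here is a minimal Python sketch. It assumes `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`); the function names, the default description, and the `/dev/sdf` device name are placeholders of mine, not something taken from the Chef tutorial. Launching the replacement instances and starting Elasticsearch on them is left out.

```python
def snapshot_volume(ec2, volume_id, description="elasticsearch data"):
    """Start a point-in-time EBS snapshot of one data volume; return its id."""
    response = ec2.create_snapshot(VolumeId=volume_id, Description=description)
    return response["SnapshotId"]


def restore_volume(ec2, snapshot_id, availability_zone, instance_id,
                   device="/dev/sdf"):
    """Create a volume from a snapshot and attach it to a new instance."""
    # The volume must live in the same availability zone as the instance.
    volume = ec2.create_volume(SnapshotId=snapshot_id,
                               AvailabilityZone=availability_zone)
    volume_id = volume["VolumeId"]
    # A fresh volume cannot be attached until it reaches the "available" state.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=device)
    return volume_id
```

Passing the client in as an argument keeps the functions easy to exercise with a stub before pointing them at a real account, which fits the "hard-terminate your nodes and see" style of testing.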

Need to take care of pruning older snapshots

With a good library such as Fog for Ruby [3], it's really easy to have it
all automated and nifty. A web search turns up many scripts for
inspiration.
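For the pruning part, a minimal sketch of one possible retention policy follows, written in Python rather than Fog/Ruby. The policy itself (keep the newest `keep` snapshots per volume) and the function name are my assumptions; the dictionary keys `SnapshotId`, `VolumeId` and `StartTime` match the entries boto3 returns from `describe_snapshots`.

```python
from collections import defaultdict


def snapshots_to_prune(snapshots, keep=7):
    """Return ids of all snapshots except the `keep` newest for each volume."""
    if keep < 0:
        raise ValueError("keep must be zero or positive")
    by_volume = defaultdict(list)
    for snap in snapshots:
        by_volume[snap["VolumeId"]].append(snap)
    doomed = []
    for snaps in by_volume.values():
        # Newest first, so everything past index `keep` is surplus.
        snaps.sort(key=lambda s: s["StartTime"], reverse=True)
        doomed.extend(s["SnapshotId"] for s in snaps[keep:])
    return sorted(doomed)
```

With a boto3 client, the input could come from `ec2.describe_snapshots(OwnerIds=["self"])["Snapshots"]`, and each returned id could then be passed to `ec2.delete_snapshot(SnapshotId=...)`. Keeping the selection logic separate from the API calls makes it easy to dry-run before deleting anything.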

Can I still go with S3 gateway if I'll take particular precautions that
someone can outline?

The official Elasticsearch advice, and my own advice based on personal
experience, is: don't do that. There are adventurous people who enjoy
some thrill, though :)

Karel

[1] http://stackoverflow.com/questions/6469556/amazon-ebs-snapshots-as-incremental-backups
[2]
Elastic — The Search AI Company | Elastic
[3] http://fog.io/about/getting_started.html
