S3 update period

sirmak · January 30, 2011, 3:12pm

Hi,

What is the time period for updating the AWS S3 data? Is it a fixed
period, configurable, or based on activity?

My second question: if I have a replication factor of 2 (and no need
to shard, so a sharding factor of 1), how should I set up my S3
configuration? Do I need to write each EC2 instance's data to S3 under
separate paths, or under the same path? Are they written as separate
files, or do both use the same file chunks?

My last question: what is the recommended way to back up? E.g., copying
the S3 chunks daily to a backup folder?

my best
Serdar

kimchy · January 30, 2011, 7:50pm

If you are using s3, then you can control the snapshotting interval (defaults to 10 seconds). The setting for that is: index.gateway.snapshot_interval. There is also a snapshot API for that.

Note, a snapshot operation only snapshots delta changes, and not the whole data every time.
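
To make the "delta only" idea concrete, here is a minimal conceptual sketch in Python. It is not Elasticsearch's actual implementation; the function names and the manifest format (file name mapped to a content digest) are hypothetical and chosen only for illustration. It compares the files in a local directory with what the previous snapshot recorded and reports which files would need to be uploaded or removed.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_delta_snapshot(local_dir: Path, remote_manifest: dict):
    """Work out what changed since the previous snapshot.

    remote_manifest maps file name -> digest as recorded by the last snapshot
    (a hypothetical format, used only for this illustration).
    Returns (to_upload, to_delete, new_manifest).
    """
    # Digest every regular file currently present in the local directory.
    current = {p.name: file_digest(p) for p in local_dir.iterdir() if p.is_file()}
    # Upload files that are new or whose contents differ from the last snapshot.
    to_upload = sorted(n for n, d in current.items() if remote_manifest.get(n) != d)
    # Remove remote copies of files that no longer exist locally.
    to_delete = sorted(n for n in remote_manifest if n not in current)
    return to_upload, to_delete, current
```

With an empty manifest every file is uploaded; on later runs only new or changed files are, which is why frequent snapshots stay cheap.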

Setting it up just requires setting the S3 gateway parameters on all nodes. A cluster uses a single bucket (and inner path) for it. The local data on the EC2 instances is stored under the data location.
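
As a rough sketch of what "the same parameters on all nodes" could look like, the Python helper below renders one set of flat settings that every node would share. Only index.gateway.snapshot_interval comes from this thread; the keys gateway.type and gateway.s3.bucket, the "10s" value format, and the bucket name are assumptions based on the S3 gateway of that era and should be checked against the documentation for your version. AWS credentials are deliberately left out.

```python
def s3_gateway_settings(bucket: str, snapshot_interval: str = "10s") -> dict:
    """Build the flat settings shared by every node in the cluster.

    The gateway.* key names are assumptions; verify them for your version.
    """
    return {
        "gateway.type": "s3",               # assumed key: selects the S3 gateway
        "gateway.s3.bucket": bucket,        # assumed key: one bucket per cluster
        "index.gateway.snapshot_interval": snapshot_interval,  # named in this thread
    }


def to_config_lines(settings: dict) -> str:
    """Render the settings as 'key: value' lines, one per setting."""
    return "\n".join(f"{key}: {value}" for key, value in settings.items())


if __name__ == "__main__":
    # "my-es-bucket" is a placeholder bucket name.
    print(to_config_lines(s3_gateway_settings("my-es-bucket", "30s")))
```

Because the output does not depend on the node, the same lines can be placed in each node's configuration file.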

Finally, consider using the local gateway with AWS; depending on your availability requirements, you can store the data either on local instance storage or on EBS.

sirmak · January 30, 2011, 8:38pm

Thanks Shay, Elasticsearch is very intelligently designed. Using EC2
local instance storage and periodically snapshotting only the deltas
to S3 is very cost-effective, and rock solid when combined with
replicas and the load-balancing and HA capabilities. Perfect...

