Fs gateway snapshots

Diptamay · June 25, 2010, 2:38pm

The problem: snapshotting only happens when I shut down the node
that I am running, and not every 30 seconds as I would expect from the
configuration below. Did I configure something incorrectly, or am I
misunderstanding when the snapshots take place?

The configuration I have set up is:

gateway:
type: fs
fs:
location: /Users/sanyal/Documents/workspace/hb_indices/meta
index:
gateway:
type: fs
fs:
location: /Users/sanyal/Documents/workspace/hb_indices/snapshot
snapshot_interval : 30
snapshot_on_close : true
memory:
enabled: true
store:
type: niofs
number_of_shards : 3
number_of_replicas : 2
path:
logs: /Users/sanyal/Documents/workspace/logs

As background, I am prototyping ES for use in our in-house CMS
application. Right now ES is set up only on my laptop, a MacBook
with a 2.26 GHz Core 2 Duo and 4 GB RAM. I am also using only
one node for indexing and searching.

kimchy · June 25, 2010, 6:36pm

Hi, here is the updated configuration that should work:

gateway:
  type: fs
  fs:
    location: /path/to/gateway/location
index:
  gateway:
    snapshot_interval: 30s
  number_of_shards: 3
  number_of_replicas: 2
path:
  logs: /path/to/logs

Some notes regarding the configuration:

You should only define the gateway type as fs at the gateway level; the
index level will automatically be fs as well. Also, the path should only be
defined at the gateway level; the index level will reuse it.

The snapshot interval is defined at the index.gateway level. Note that, by
default, a time_value in elasticsearch is in milliseconds, so you need to
define 30s. Also, I am surprised that you say you did not see snapshotting
happen, since the default is 10s. Note that a snapshot will only happen if
there are changes.

I removed the other settings that are already the default, like snapshot on
close. Note that if you do want to set it, it's also at the index.gateway
level.

You set number_of_shards and number_of_replicas. This means that
these are now the default values for any index created, unless explicitly
specified in the create index API. This applies to all index-level settings.

-shay.banon
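
The time-value note above (a bare 30 is read as milliseconds, while 30s means thirty seconds) can be sketched in a few lines of Python. This is only an illustration of the rule as described in the reply, not Elasticsearch's own parser; the function name and the unit table are hypothetical, and only the bare-number and "s" cases are confirmed by the post.

```python
import re

# Hypothetical unit table for illustration. Only the bare-number case
# (milliseconds) and the "s" suffix are confirmed by the reply above;
# "ms" and "m" are assumptions added to make the sketch less trivial.
_UNIT_TO_MS = {"": 1, "ms": 1, "s": 1_000, "m": 60_000}


def time_value_to_millis(raw):
    """Convert a setting such as 30 or "30s" into milliseconds."""
    match = re.fullmatch(r"\s*(\d+)\s*(ms|s|m)?\s*", str(raw))
    if match is None:
        raise ValueError(f"unrecognised time value: {raw!r}")
    number, unit = match.groups()
    # A missing suffix falls back to milliseconds.
    return int(number) * _UNIT_TO_MS[unit or ""]


print(time_value_to_millis(30))     # the original config: 30 milliseconds
print(time_value_to_millis("30s"))  # the corrected config: 30000 milliseconds
```

Read this way, `snapshot_interval : 30` in the original configuration asks for a 30 ms interval rather than the intended 30 seconds.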

Diptamay · June 25, 2010, 8:07pm

Hi Shay

Thanks for the configuration. I see that snapshotting just keeps
updating the segments_N and translog files. Shouldn't the segments.gen
and *.fdx files be backed up as well?

Correct me if I am wrong, but if I stop and start the search server, does
the whole index get rebuilt from the translog? For example:

[15:57:17,257][INFO ][node ] [Metalhead] {Elasticsearch/0.8.0}[37916]: Initializing ...
[15:57:17,261][INFO ][plugins ] [Metalhead] Loaded
[15:57:18,344][INFO ][node ] [Metalhead] {Elasticsearch/0.8.0}[37916]: Initialized
[15:57:18,344][INFO ][node ] [Metalhead] {Elasticsearch/0.8.0}[37916]: Starting ...
[15:57:18,435][INFO ][transport ] [Metalhead] bound_address[inet[/0.0.0.0:9300]], publish_address[inet[/169.254.180.96:9300]]
[15:57:21,571][INFO ][cluster.service ] [Metalhead] New Master [Metalhead][aa6c1c96-7b8d-4943-927c-33816a48ee9e][inet[/169.254.180.96:9300]], Reason: zen-disco-initial_connect(master)
[15:57:21,611][INFO ][discovery ] [Metalhead] elasticsearch/aa6c1c96-7b8d-4943-927c-33816a48ee9e
[15:57:21,635][INFO ][cluster.metadata ] [Metalhead] Creating Index [hb_14], cause [gateway], shards [3]/[2], mappings [audio, article, page]
[15:57:21,980][INFO ][http ] [Metalhead] bound_address[inet[/0.0.0.0:9200]], publish_address[inet[/169.254.180.96:9200]]
[15:57:22,236][INFO ][jmx ] [Metalhead] bound_address[service:jmx:rmi:///jndi/rmi://:9400/jmxrmi], publish_address[service:jmx:rmi:///jndi/rmi://169.254.180.96:9400/jmxrmi]

If this is so, wouldn't this be a costly thing to do in production
with millions of documents?

Thanks
Diptamay

kimchy · June 25, 2010, 8:21pm

The translog is there so that a flush (in elasticsearch terms, which maps to
performing a Lucene commit) does not need to be performed for each
operation. By default, a flush is executed after 5000 docs have been added
to the translog, at which point a commit is done and a new translog gets
created. Until then, there are no "new" files in the index, so they don't
get snapshotted to the gateway; only the translog does.

So, to your question: at most, only 5000 docs will need to be
reapplied to a shard recovered from the gateway, not all the changes
ever made, and this is manageable.

-shay.banon
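
The flush behaviour described above can be sketched as a toy model in Python. This is not Elasticsearch code: the class and method names are hypothetical, and the only figure taken from the reply is the default of 5000 docs per flush. It simply shows why the number of operations to replay on recovery stays below the flush threshold, however many documents were indexed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShard:
    """Toy model of one shard: committed docs plus an uncommitted translog."""
    flush_threshold: int = 5000   # default quoted in the reply above
    committed_docs: int = 0       # docs covered by the last commit
    commits: int = 0              # number of flushes performed so far
    translog: list = field(default_factory=list)

    def index(self, doc):
        # Every operation is appended to the translog first.
        self.translog.append(doc)
        if len(self.translog) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # A flush commits the pending docs and starts a fresh translog.
        self.committed_docs += len(self.translog)
        self.commits += 1
        self.translog = []

    def ops_to_replay(self):
        # On recovery, only the translog is reapplied on top of the last commit.
        return len(self.translog)


shard = ToyShard()
for i in range(12_345):
    shard.index({"id": i})

print(shard.commits)          # 2 (flushes at 5,000 and 10,000 docs)
print(shard.committed_docs)   # 10000
print(shard.ops_to_replay())  # 2345
```

In this model, the 10,000 committed docs correspond to index files that a snapshot can copy as they are, and only the 2,345 translog entries would be replayed after a restart.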

Diptamay · June 25, 2010, 9:22pm

Hi Shay

Thanks for the info. I see the *.cfs files getting snapshotted, now
that I loaded like 100k documents.

-Diptamay
