Except it doesn't seem to be working as we understand it. We have a 2-node
setup, with the gateway snapshot set to fs:
gateway:
  type: fs
  fs:
    location: /mnt/esgateway/
We then do the following:
- set the disable_flush property to true, as outlined in Issue #906
- set the snapshot interval to 0 (index.gateway.snapshot_interval)
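For readers following along, the two changes above can be sketched as a single call to the index update-settings endpoint. This is only an illustration: the host, the index name "myindex", and the helper build_settings_request are made up for the example, and the full key index.translog.disable_flush is an assumption (the thread only names disable_flush). Only index.gateway.snapshot_interval is spelled out in the message itself.

```python
import json
import urllib.request


def build_settings_request(base_url, index, settings):
    """Build (but do not send) a PUT to the index's _settings endpoint.

    Hypothetical helper for illustration; not part of any client library.
    """
    url = f"{base_url.rstrip('/')}/{index}/_settings"
    body = json.dumps(settings).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Host and index name are placeholders; the disable_flush key is assumed.
req = build_settings_request(
    "http://localhost:9200",
    "myindex",
    {
        "index.translog.disable_flush": True,
        "index.gateway.snapshot_interval": 0,
    },
)
# urllib.request.urlopen(req) would actually send it; omitted here.
```

Building the request separately from sending it makes it easy to inspect exactly what would be posted before touching a live cluster.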
However, when mutating the index under these settings, we see via the
filesystem and via TRACE-level debugging that the node responsible for the
gateway snapshot, while reporting that the settings for disable_flush and
snapshot_interval were set correctly, promptly goes and does a snapshot,
mutating the snapshot directory state as soon as we index something. This
means that the snapshot directory can't be safely copied, if I understand
correctly. We were expecting that setting disable_flush, disabling the
snapshot, and performing a manual flush would allow a safe copy with no
filesystem mutations until we re-enable them.
Feels like we're either misinterpreting something or hitting a bug.
As a side note, when using the _settings API, a GET only reports a small
set of values initially; only after we modify disable_flush and
snapshot_interval via the _settings API do those values start appearing.
Is there any reason that all settings values shouldn't always be
displayed?
The disable flush flag was introduced so that backing up when using the
local gateway would be simpler. It's not really needed with a shared
gateway (like the shared fs one you use); disabling the snapshot interval
should be enough for it.
Are you saying that when you set the snapshot interval to 0, a snapshot
still happens?
Regarding the settings, the ones returned by the get settings API are only
those explicitly set; it does not return settings with "default" values.
Ok, thanks, that makes sense.
Are you saying that when you set the snapshot interval to 0, a snapshot
still happens?
Yes. When set to 0, periodic/timed snapshots stop happening, but as soon as
we index something the snapshot happens (see the log gist). We tried
setting the snapshot interval to a large number, and that does work, EXCEPT
that after resetting the interval one has to wait for the larger interval
to elapse before the new setting takes effect. This is presumably because
the thread sleeps until that larger interval expires and isn't woken up
when the configuration changes.
Should I write up a bug report that snapshot_interval=0 doesn't work?
Regarding the settings, the ones returned by the get settings API are only
those explicitly set; it does not return settings with "default" values.
Is there any way, other than looking at the docs, to find out what value a
particular setting is currently configured to?
I posted this on the issue you opened, but can you try setting the
snapshot_interval setting to -1? Setting it to 0 will not disable it
properly.
Regarding the other problem, changing the snapshot interval should take
effect immediately: it will cancel the previous periodic task and schedule
a new one.
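The suggestion above (set the interval to -1 for the duration of the copy, then put it back) can be sketched as a small context manager. This is a minimal sketch under stated assumptions, not a tested backup procedure: the send callable, the index name "myindex", and the restore value "10s" are all placeholders chosen for the example, and whether -1 fully stops snapshots is exactly what the thread is trying to confirm.

```python
import json
from contextlib import contextmanager

SNAPSHOT_KEY = "index.gateway.snapshot_interval"


@contextmanager
def snapshots_paused(send, index, restore_value):
    """Set the snapshot interval to -1 for the duration of the block.

    send(method, path, body) is a caller-supplied function that performs
    the HTTP call (hypothetical; not part of any Elasticsearch client).
    """
    path = f"/{index}/_settings"
    send("PUT", path, json.dumps({SNAPSHOT_KEY: -1}))
    try:
        yield
    finally:
        # Always put the previous interval back, even if the copy fails.
        send("PUT", path, json.dumps({SNAPSHOT_KEY: restore_value}))


# Usage with a recording stub in place of a real HTTP client.
calls = []
with snapshots_paused(lambda *args: calls.append(args), "myindex", "10s"):
    pass  # copy the gateway directory here, e.g. with shutil.copytree
```

Restoring in a finally block means a failed copy does not leave periodic snapshots switched off.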