Question about index.translog.disable_flush


(yantsu) #1

I'm trying to use the settings API to set index.translog.disable_flush
to true and then back up the elasticsearch data.

I'm using the local gateway as configured by default.

Setting the disable_flush parameter produces log output suggesting
that each index has accepted the option.

If I start some indexing operations after that, I can see filesystem
activity inside the index and translog directories of the index data.

I was expecting no filesystem activity in this place with flush
disabled.

Does that mean translog data is written to the filesystem even if
flush is disabled?

If this is the case, backing up the whole data directory may produce
warnings from the backup software that files changed during the
backup process. Such warnings could be ignored, trusting
elasticsearch to ignore those files as well when the cluster is
restarted from the backup.

To avoid such warnings, it would be handy to have inclusion or
exclusion patterns in place for backing up the data directory.

Is it possible to provide such patterns?

Thanks

Claas


(Shay Banon) #2

Disabling flush does not mean there won't be disk activity; it just means you can copy the data safely (technically, no Lucene commit is issued while you copy).
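
As a rough illustration of the sequence described here (flush, disable flush, copy, re-enable flush), here is a minimal Python sketch. It is not an official procedure: the helper names, the base URL, the index name, and the exact request paths and body shape are assumptions for illustration and may differ between elasticsearch versions. Only the setting name index.translog.disable_flush comes from this thread.

```python
import json
import urllib.request
from contextlib import contextmanager


def flush_request(base_url, index):
    # Assumed endpoint: an explicit flush so the latest state is committed.
    return ("POST", f"{base_url}/{index}/_flush", None)


def disable_flush_request(base_url, index, disabled):
    # Assumed body shape for the update-settings call.
    body = json.dumps({"index.translog.disable_flush": disabled})
    return ("PUT", f"{base_url}/{index}/_settings", body)


def send(request):
    """Send one (method, url, body) tuple over HTTP and return the status code."""
    method, url, body = request
    data = body.encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


@contextmanager
def flush_disabled(base_url, index, sender=send):
    """Flush once, disable flush for the duration of the block, then re-enable it."""
    sender(flush_request(base_url, index))
    sender(disable_flush_request(base_url, index, True))
    try:
        yield
    finally:
        # Re-enable flush even if the copy step fails.
        sender(disable_flush_request(base_url, index, False))
```

Usage would look like `with flush_disabled("http://localhost:9200", "myindex"): copy_data_directory()`, where `copy_data_directory` is whatever backup step you use (a hypothetical name). The try/finally is there so that a failed copy does not leave flush disabled.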

(In reply to yantsu's message of Wednesday, March 7, 2012, 12:38 PM.)


(yantsu) #3

Thanks for the answer.

Together with the answer in this thread
(https://groups.google.com/group/elasticsearch/browse_frm/thread/14cf8d1002a76dfe#),
it leads me to assume that after flushing and disabling flush, it is
better and safe not to include the translog files in the backup.
They are definitely changed by indexing operations, and there is a
chance they are stored in an inconsistent state in the backup.
I also saw more files added to the data directory during indexing
with flush disabled: files with extensions like *.fdt, *.fdx, ...,
*.tis.

Is it enough to leave out the translog files to get a consistent state
in the backup, or are there other files related to the translog that
are better left out because they are taken into account during a
cluster restart?

Thanks

Claas
(In reply to Shay Banon's message of 7 March, 12:48.)


(Shay Banon) #4

You can copy the translog files, that's fine, but you don't have to if you issue a flush beforehand. More files are created in the index location because refreshes of the index happen regularly. It's perfectly fine to back it up while it's "changing", since the "committed" (flushed) state is the one that will be recovered later on.
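
For the exclusion patterns asked about earlier in the thread, one option is to do the filtering in the copy step itself. The Python sketch below assumes that translog data lives in directories literally named "translog" under the data directory, as observed above. The function name and the paths are hypothetical, and skipping the translog is only reasonable if a flush was issued beforehand, as noted.

```python
import shutil


def backup_data_dir(data_dir, backup_dir, skip_translog=True):
    """Copy data_dir to backup_dir, optionally skipping entries named 'translog'."""
    # ignore_patterns matches entry names at every directory level.
    ignore = shutil.ignore_patterns("translog") if skip_translog else None
    # backup_dir must not exist yet; copytree creates it.
    return shutil.copytree(data_dir, backup_dir, ignore=ignore)
```

Because the index directory keeps changing while indexing continues, a file may disappear between being listed and being copied; in that case `shutil.copytree` raises `shutil.Error` after copying what it can, so a real backup script would need to decide how to handle that.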

(In reply to yantsu's message of Wednesday, March 7, 2012, 2:39 PM.)

