Backing Up ES

Eugene_Strokin · March 13, 2012, 3:24am

Hello, I'm planning to use the very nice script by Karussell

https://gist.github.com/karussell/1074906

backup.sh

# TO_FOLDER=/something
# FROM=/your-es-installation

DATE=$(date +%Y-%m-%d_%H-%M)
TO="$TO_FOLDER/$DATE/"
echo "rsync from $FROM to $TO"
# the first rsync run can take a while - do not disable flushing yet
rsync -a "$FROM" "$TO"

# now disable flushing and do one manual flushing

This file has been truncated.

es-flush-disable.sh

# true or false
DISABLE=$1
curl -XPUT 'localhost:9200/_settings' -d '{
   "index" : {
      "translog.disable_flush" : "'$DISABLE'"
   }
}'

es-flush.sh

curl -XPOST 'localhost:9200/_flush'

There are more than three files.

for backing up my ES.
But I currently have a cluster of 2 machines running ES.
I want to make sure that it is enough to copy the files from just one
machine, because after a flush they are the same on both servers.
And to restore, I would just need to restore the files on one machine, start
ES there, and on the other run an empty ES server with my custom mappings,
so that the second machine picks the data up from the first.
Am I correct, or is my assumption mistaken?

Thank you,
Eugene S.

kimchy · March 14, 2012, 12:05pm

Yes, assuming you have 1 replica, and 2 machines, you only need to copy one machine over. And the restore process you mentioned is good.
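The reasoning behind this answer can be sketched as a small check. It assumes (this is not stated in the thread) that all shard copies are currently allocated, and it relies on ES not placing two copies of the same shard on one node. The function name is made up for illustration.

```shell
# Hypothetical helper: succeeds when a single node's data directory
# should hold a copy of every shard.
# $1 = number of data nodes, $2 = index.number_of_replicas
single_node_copy_is_complete() {
  copies=$(( $2 + 1 ))          # one primary plus its replicas
  [ "$copies" -ge "$1" ]
}

if single_node_copy_is_complete 2 1; then
  echo "2 nodes, 1 replica: copying one node is enough"
fi
if ! single_node_copy_is_complete 3 1; then
  echo "3 nodes, 1 replica: back up every node"
fi
```

With more nodes than shard copies, no single node is guaranteed to hold everything, which is why the one-machine shortcut only applies to setups like the one described here.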

Barsk · March 19, 2012, 9:13am

Regarding backup: since I am using ES as a sort of NoSQL database, it
will not be possible to reindex from scratch after updates have been made
to the documents.
Is it guaranteed that it will always be possible to import an index into a
newer version of ES without needing to reindex it?

Second question: how do you back up shards that are not on the local
node? I am not there yet, but I might be in the future...

/Kristian

--
Kind regards,
Kristian Jörg

Devo IT AB
Tel: 054 - 22 14 58, 0709 - 15 83 42
Email: kristian.jorg@devo.se
Web: http://www.devo.se

kimchy · March 20, 2012, 10:41am

Yes, you will always be able to upgrade to a newer version without needing
to reindex. For a multi-node cluster, the simplest backup option is to back
up each node's data location.
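Backing up each node's data location could look roughly like the loop below. This is only a sketch: the host names, paths, and the `backup_all_nodes` helper are hypothetical, and it assumes the backup machine can reach every node with rsync over SSH. By default it only prints the commands; set `DRY_RUN=0` to actually run them.

```shell
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands (default)

# Hypothetical helper.
# $1 = data directory on each node, $2 = local backup root, rest = node hosts
backup_all_nodes() {
  src=$1
  dest=$2
  shift 2
  for host in "$@"; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "rsync -a $host:$src/ $dest/$host/"
    else
      # keep one sub-directory per node so the copies stay separate
      mkdir -p "$dest/$host" && rsync -a "$host:$src/" "$dest/$host/"
    fi
  done
}

# Example with made-up hosts and paths:
backup_all_nodes /var/lib/elasticsearch /backup/es es-node1 es-node2
```

Keeping one directory per node makes it straightforward later to hand each node its own copy back.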

Barsk · March 20, 2012, 1:52pm

OK, but does one still need to temporarily disable flushing on each node
prior to the backup, as the script does? Or is the setting "global" for
the cluster?

kimchy · March 20, 2012, 8:17pm

In the script, the settings update is global and applies to all indices. So
you only need to execute it once and it will apply to the whole cluster.
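Since the setting is applied once for the whole cluster, the toggle can wrap whatever copy step is used. The sketch below is not the gist's actual code (the embed above is truncated); it reuses the same `_settings` and `_flush` calls shown in es-flush-disable.sh and es-flush.sh, and the function names, `ES_URL`, and `DRY_RUN` are assumptions for illustration. By default it only prints the requests instead of sending them.

```shell
DRY_RUN="${DRY_RUN:-1}"              # 1 = print requests instead of sending them
ES_URL="${ES_URL:-localhost:9200}"   # assumption: any node's HTTP endpoint

set_flush_disabled() {   # $1 = true | false
  body="{\"index\":{\"translog.disable_flush\":\"$1\"}}"
  if [ "$DRY_RUN" = "1" ]; then
    echo "PUT $ES_URL/_settings $body"
  else
    curl -sS -XPUT "$ES_URL/_settings" -d "$body"
  fi
}

flush_now() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "POST $ES_URL/_flush"
  else
    curl -sS -XPOST "$ES_URL/_flush"
  fi
}

# Runs the given command between "disable flush" and "re-enable flush".
with_flush_disabled() {
  set_flush_disabled true
  flush_now
  status=0
  "$@" || status=$?
  set_flush_disabled false   # re-enable even if the copy step failed
  return "$status"
}

# Example: the copy step here is just a placeholder command.
with_flush_disabled echo "copy the data directories here"
```

Re-enabling flushing after a failed copy is a deliberate choice in this sketch: leaving it disabled would let the transaction log keep growing.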

Barsk · March 21, 2012, 7:10am

Yes, that is what I thought.

Thanks for all the help!

/Kristian

Frederic · March 21, 2012, 3:21pm

Hi Kimchy,

Based on what you said, I understand that if I needed to restore backed-up
data, I would only need to replace the current data directory with its copy,
shutting down the cluster first, I guess.

So, let's suppose I want to recover the data but one of the servers has
failed and can't run again, though I can still recover its data (stored on a
remote iSCSI volume in this case). I would only need to replace that server
with a new one and copy/mount the data on the new server, so that there is
always 1 node per data copy, right?

Thanks,

kimchy · March 25, 2012, 10:18am

Yes, you just need to have each node pointing to its backed-up data
location. Then start the cluster back up.
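One way to put a backed-up copy in place on a stopped node is sketched below. The `restore_data_dir` helper and the paths are hypothetical, and the sketch assumes the node is shut down while the directory is swapped. It moves the current data directory aside rather than deleting it, so it can be inspected or put back if the restore turns out to be wrong.

```shell
# Hypothetical helper; run only while the ES node is stopped.
# $1 = backed-up copy of this node's data, $2 = the node's data directory
restore_data_dir() {
  backup=$1
  data=$2
  if [ ! -d "$backup" ]; then
    echo "no backup found at $backup" >&2
    return 1
  fi
  if [ -e "$data" ]; then
    mv "$data" "$data.old.$$"   # keep the previous data until the restore is verified
  fi
  cp -Rp "$backup" "$data"      # recursive copy, preserving modes and timestamps
}

# Example with made-up paths (commented out so nothing is touched by default):
# restore_data_dir /backup/es/es-node1 /var/lib/elasticsearch
```

Mounting the recovered volume at the node's data path, as in the iSCSI case described above, achieves the same thing without a copy.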

