Shard allocation for backup / archive


(James Dooley) #1

I am hoping to set up a system where I can allocate a full copy of an
index's shards to a single server for backup / archiving purposes.

My understanding is that cluster.routing.allocation.[in|ex]clude.tag or .ip
should do the trick: tell all of the indexes not to use the backup server
unless I want to run backups on that index.

I can set the exclude value on each active index using:

curl -XPUT localhost:9200/_settings

But this has to be set every time a new index is created, and I expect this to get messy since we have multiple indexes per day.

I can set a cluster wide setting using:

curl -XPUT localhost:9200/_cluster/settings

But this seems to be treated as a mandatory setting instead of a default. So, for example, if the cluster setting says to ignore the backup server, I cannot tell an individual index to include only the backup server.

Are there any alternatives, or am I stuck with having to update each index's settings every time?
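
For reference, the two truncated commands above might be completed along the following lines. This is only a sketch: it assumes the backup node was started with a custom attribute named "tag" whose value is "backup", and ES_URL, DRY_RUN and the helper function names are invented for the example. By default it only prints the curl commands.

```shell
#!/bin/sh
# Assumptions (not from the thread): the backup node carries a custom
# attribute "tag" with the value "backup"; ES_URL, DRY_RUN and the helper
# names below are made up for this sketch.
ES_URL="${ES_URL:-localhost:9200}"

# Body for an index-level exclude filter on the given tag value.
index_exclude_body() {
  printf '{"index.routing.allocation.exclude.tag": "%s"}' "$1"
}

# Body for the cluster-level filter; the cluster settings API expects the
# key wrapped in "transient" or "persistent".
cluster_exclude_body() {
  printf '{"transient": {"cluster.routing.allocation.exclude.tag": "%s"}}' "$1"
}

# Print the curl command instead of running it unless DRY_RUN=0.
# Newer Elasticsearch releases also need -H 'Content-Type: application/json'.
put_json() {
  # $1 = URL path, $2 = JSON body
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf "curl -XPUT '%s/%s' -d '%s'\n" "$ES_URL" "$1" "$2"
  else
    curl -XPUT "$ES_URL/$1" -d "$2"
  fi
}

# Keep every existing index off the backup node (per-index setting).
put_json "_settings" "$(index_exclude_body backup)"
# Cluster-wide variant of the same filter.
put_json "_cluster/settings" "$(cluster_exclude_body backup)"
```

The first call only affects indexes that exist when it runs, which is the maintenance burden described above.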



(Marcin Dojwa) #2

Hi, I guess you can use templates here. Each new index would get settings
from the template automatically. Check
http://www.elasticsearch.org/guide/reference/api/admin-indices-templates.html
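
A template for this case could look roughly like the sketch below. It assumes the legacy template format, in which the index name pattern goes in a "template" key, and a node attribute called "tag"; the template name "no_backup", ES_URL and DRY_RUN are made up for the example. By default it only prints the curl command.

```shell
#!/bin/sh
# Assumptions (not from the thread): legacy index template format with a
# "template" pattern key, a node attribute "tag", and the hypothetical
# template name "no_backup".
ES_URL="${ES_URL:-localhost:9200}"

# Build a template body that excludes the given tag for matching indexes.
template_body() {
  # $1 = index name pattern, $2 = tag value to exclude
  printf '{"template": "%s", "settings": {"index.routing.allocation.exclude.tag": "%s"}}' "$1" "$2"
}

body="$(template_body '*' backup)"

# Print the request by default; set DRY_RUN=0 to send it.
if [ "${DRY_RUN:-1}" = "1" ]; then
  printf "curl -XPUT '%s/_template/no_backup' -d '%s'\n" "$ES_URL" "$body"
else
  curl -XPUT "$ES_URL/_template/no_backup" -d "$body"
fi
```

A template only seeds the settings when an index is created, so the value can still be changed on an individual index afterwards.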

Best regards.

2012/10/5 James Dooley smsldoo@gmail.com



(James Dooley) #3

Hm, looks like that is exactly what I need. Thanks.

One last question: I just noticed that after I force the index onto the
backup server using allocation.include: "backup", I am unable to remove the
index.routing.allocation.include value once I am done with it.

If I set it to "", it is taken literally as "include no nodes". I looked
around a bit; is there any way to actually remove a setting?

Example:

curl -XPUT 'localhost:9200/index/_settings' -d '{"index.routing.allocation.include.tag": "backup"}'

moves one replica's worth of shards to the backup server.

curl -XPUT 'localhost:9200/index/_settings' -d '{"index.routing.allocation.include.tag": ""}'

drops the replica from the backup server but does not allocate its shards
anywhere else. I still have the primary shards allocated, but no replicas.

I imagine that if I could just remove the
index.routing.allocation.include.tag setting altogether, it would start
working normally again.
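
One thing that may be worth trying: later Elasticsearch releases document that assigning null to a setting resets it. Whether the release discussed in this thread accepts that is not established here, so treat the sketch below as an assumption to verify. ES_URL, DRY_RUN and the function names are invented for the example, and by default it only prints the curl command.

```shell
#!/bin/sh
# Assumption: the cluster runs a release that accepts null as "reset this
# setting". This is documented for later Elasticsearch versions and is not
# confirmed for the version used in this thread.
ES_URL="${ES_URL:-localhost:9200}"

# Body that resets one setting by assigning null to it.
reset_body() {
  printf '{"%s": null}' "$1"
}

# Print the request by default; set DRY_RUN=0 to send it.
reset_setting() {
  # $1 = index name, $2 = setting key to reset
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf "curl -XPUT '%s/%s/_settings' -d '%s'\n" "$ES_URL" "$1" "$(reset_body "$2")"
  else
    curl -XPUT "$ES_URL/$1/_settings" -d "$(reset_body "$2")"
  fi
}

reset_setting index index.routing.allocation.include.tag
```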

On Friday, October 5, 2012 9:41:47 AM UTC-4, Marcin Dojwa wrote:



(José de Zárate) #4

James,
Just a small test for myself, to check whether I've understood the docs correctly.
What you intend to do is:

  • tell elasticsearch to "move" a full index (i.e. all of its shards and
    replicas) to a single node by means of the "routing.allocation.include.*"
    setting
  • tell elasticsearch to exclude all the remaining indexes from the node
    you just moved that index to, via the "routing.allocation.exclude.*"
    setting
  • make a backup of that server using any suitable external tool (tar
    -cvf of the index file directories, for instance), knowing that the only
    data that host has is the data of the index you're backing up
  • during that time, the index is still available for indexing/searching
    requests made to the cluster, unless you have used the "block read" and/or
    "block write" settings for that index

Is that your strategy? Thanks for the answer.
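
If that reading is right, the sequence for one index might be sketched as follows. Nothing here is confirmed in the thread: the data directory, the archive name, ES_URL and the function name are hypothetical, the node attribute "tag" is assumed, and the script only prints the commands rather than running them. Copying files from an index that is still receiving writes may also give an inconsistent copy, which the thread does not address.

```shell
#!/bin/sh
# Hypothetical values: ES_URL, the data directory on the backup host and
# the archive name. The node attribute "tag" = "backup" is an assumption.
ES_URL="${ES_URL:-localhost:9200}"
DATA_DIR="${DATA_DIR:-/var/lib/elasticsearch}"

# Print the commands for backing up one index; nothing is executed.
backup_plan() {
  # $1 = index to back up
  # Step 1: ask for a copy of the index on the node tagged "backup".
  printf "curl -XPUT '%s/%s/_settings' -d '%s'\n" "$ES_URL" "$1" \
    '{"index.routing.allocation.include.tag": "backup"}'
  # Step 2: on the backup host, once the shards have finished relocating,
  # archive the data directory.
  printf "tar -cvf '%s.tar' '%s'\n" "$1" "$DATA_DIR"
}

backup_plan index
```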

On Friday, October 5, 2012 9:53:37 AM UTC-4, James Dooley wrote:


