Unable to make snapshots to NFS filesystem

Hi all,

I have been struggling to put together a backup solution for my ES cluster.

I have read the documentation at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-snapshots.html

but I still can't understand why the following is failing:

I have exported an NFS filesystem to both nodes of my 2-node ES cluster,
mounted as /srv/backup.

I created the elasticsearch user on the NFS server too, and the backup
directory there is owned by that user:

[root@back01 ~]# ls -ld /srv/backup/es_backup
drwxrwx---. 3 elasticsearch elasticsearch 4096 Sep 29 18:37 /srv/backup/es_backup

Start with a clean filesystem:

[root@logdata01 ~]# rm -rf /srv/backup/*

Register the backup area:

[root@logdata01 ~]# curl -s -XPUT http://localhost:9200/_snapshot/backup -d '{
  "type": "fs",
  "settings": {
    "location": "/srv/backup"
  }
}'

{"acknowledged":true}

Create a snapshot:

[root@logdata01 ~]# curl -XPUT "localhost:9200/_snapshot/backup/tcom_snapshot?wait_for_completion=true&pretty"

I then get failures on various shards:
https://gist.github.com/alexharv074/b4c7d35028c425f70f20

Any help on how I could get this cluster into a sane state that can be
backed up would be greatly appreciated.

Best regards
Alex

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/ec09f9bd-5075-4bd5-adf7-d88cbb636c1f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Can you do an ls -ld /srv/backup and provide the output?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 29 September 2014 18:45, Alex Harvey alexharv074@gmail.com wrote:

I then get failures on various shards
https://gist.github.com/alexharv074/b4c7d35028c425f70f20


--

Thanks for responding.

It doesn't seem to be a permissions problem:

[root@logdata01 ~]# ls -ld /srv/backup
drwxrwx---. 3 elasticsearch elasticsearch 4096 Sep 29 18:43 /srv/backup

[root@logdata01 ~]# find /srv/backup/ ! -user elasticsearch -or ! -group elasticsearch
[root@logdata01 ~]#

[root@logdata01 ~]# find /srv/backup -ls |head
131073 4 drwxrwx--- 3 elasticsearch elasticsearch 4096 Sep 29 18:43 /srv/backup
131076 4 drwxr-xr-x 12 elasticsearch elasticsearch 4096 Sep 29 18:37 /srv/backup/indices
131095 4 drwxr-xr-x 6 elasticsearch elasticsearch 4096 Sep 29 18:42 /srv/backup/indices/logstash-2014.09.28
131096 8 -rw-r--r-- 1 elasticsearch elasticsearch 4120 Sep 29 18:37 /srv/backup/indices/logstash-2014.09.28/snapshot-tcom_snapshot
131189 4 drwxr-xr-x 2 elasticsearch elasticsearch 4096 Sep 29 18:37 /srv/backup/indices/logstash-2014.09.28/3
131193 8 -rw-r--r-- 1 elasticsearch elasticsearch 4443 Sep 29 18:37 /srv/backup/indices/logstash-2014.09.28/3/__3
131201 4 -rw-r--r-- 1 elasticsearch elasticsearch 689 Sep 29 18:37 /srv/backup/indices/logstash-2014.09.28/3/__b
131202 4 -rw-r--r-- 1 elasticsearch elasticsearch 61 Sep 29 18:37 /srv/backup/indices/logstash-2014.09.28/3/__c
131206 4 -rw-r--r-- 1 elasticsearch elasticsearch 281 Sep 29 18:37 /srv/backup/indices/logstash-2014.09.28/3/__g
131200 4 -rw-r--r-- 1 elasticsearch elasticsearch 349 Sep 29 18:37 /srv/backup/indices/logstash-2014.09.28/3/__a

On Monday, September 29, 2014 8:02:42 PM UTC+10, Mark Walkom wrote:

Can you do an ls -ld /srv/backup and provide the output?


--

Hi Alex,

Any chance you have a disk quota enabled on the NFS share? I see this in the
snapshot output:

"IndexShardSnapshotFailedException[[logstash-2014.09.19][4] Failed to
perform snapshot (index files)]; nested: IOException[No space left on
device]; "

Can you try copying a larger file to the NFS server as user elasticsearch?
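
A minimal sketch of that check, in case it helps. It only assumes a POSIX shell with df and dd; the function name check_write and the hidden test file name are made up for illustration, and the size you pass is up to you.

```shell
#!/bin/sh
# Sketch: verify that a directory accepts a write of a given size.
# check_write and the .write_test.<pid> file name are illustrative only.
check_write() {
    dir="$1"        # directory to test, e.g. the NFS mount point
    size_mb="$2"    # how many MiB of zeros to write
    testfile="$dir/.write_test.$$"

    # Show what the filesystem itself reports as free space (1 KiB blocks).
    df -Pk "$dir"

    # Write size_mb MiB of zeros. dd exits non-zero if the write cannot
    # complete, for example because the device is full or a quota is hit.
    if dd if=/dev/zero of="$testfile" bs=1048576 count="$size_mb"; then
        echo "write of ${size_mb} MiB to $dir succeeded"
        rm -f "$testfile"
        return 0
    else
        echo "write of ${size_mb} MiB to $dir FAILED" >&2
        rm -f "$testfile"
        return 1
    fi
}
```

To test as the elasticsearch user against the mount point from this thread, something like the following should do (the script file name and the 1024 MiB size are arbitrary): sudo -u elasticsearch sh -c '. ./check_write.sh; check_write /srv/backup 1024'. A full export or an exceeded quota would normally show up in the dd error as "No space left on device" or "Disk quota exceeded".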

Regards,
Ciprian Hacman

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Tuesday, September 30, 2014 3:30:08 AM UTC+3, Alex Harvey wrote:

Thanks for responding.

It doesn't seem to be a permissions problem.


--

Ciprian,

Thanks for your input. I had indeed missed that disk space failure, and it
turns out I was hitting an intermittent disk space issue.
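
For anyone who hits the same thing: the reason is easy to miss in a long response. Here is a rough sketch for pulling out the distinct failure reasons, assuming the snapshot response is JSON in which each failed shard carries a "reason" string like the message Ciprian quoted. The function name snapshot_failure_reasons is made up, and the grep pattern is a simplification that does not handle escaped quotes inside a reason.

```shell
#!/bin/sh
# Sketch: summarise the failure reasons found in a snapshot response.
# Reads JSON on stdin; assumes each failure has a "reason": "..." field.
snapshot_failure_reasons() {
    # Extract every "reason": "..." pair, then count identical ones,
    # most frequent first.
    grep -o '"reason" *: *"[^"]*"' | sort | uniq -c | sort -rn
}
```

It can be fed straight from the snapshot call used earlier in the thread: curl -s -XPUT "localhost:9200/_snapshot/backup/tcom_snapshot?wait_for_completion=true&pretty" | snapshot_failure_reasons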
