Can't stop a snapshot running on my cluster

Andrew_Vos · May 24, 2014, 2:17pm

A few days ago I started a snapshot, but instead of using a shared network
I used the local filesystem. Because my root partition only had 8gb (and
this is where I stored the snapshots) the partition got filled up and three
of my seven elasticsearch boxes crashed almost instantly.

I've since created a new cluster and let the data replicate over. The
problem now is, I can't seem to cancel this snapshot!

Looking at the snapshot:
curl -XGET "localhost:9999/_snapshot/production_backup/_snapshot1"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2:9300]][cluster/snapshot/get]];
nested: SnapshotMissingException[[production_backup:_snapshot1] is
missing]; nested:
FileNotFoundException[/ebs/snapshot-backup/snapshot-_snapshot1 (No such
file or directory)]; ","status":404}% *

Starting a new snapshot:
curl -XPUT
"localhost:9999/_snapshot/production_backup/snapshot_1?wait_for_completion=false"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2:9300]][cluster/snapshot/create]];
nested: ConcurrentSnapshotExecutionException[[production_backup:snapshot_1]
a snapshot is already running]; ","status":503}% *

This just never completes:

curl -XDELETE "localhost:9999/_snapshot/production_backup/snapshot_1"

Any idea how I can solve this?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/af485e13-dc74-4e88-b6db-e3a4d67fb00c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Igor_Motov · May 24, 2014, 5:33pm

Which version of elsticsearch are you using? Can you send me the current
cluster state?

On Saturday, May 24, 2014 10:17:43 AM UTC-4, Andrew Vos wrote:

A few days ago I started a snapshot, but instead of using a shared network
I used the local filesystem. Because my root partition only had 8gb (and
this is where I stored the snapshots) the partition got filled up and three
of my seven elasticsearch boxes crashed almost instantly.

I've since created a new cluster and let the data replicate over. The
problem now is, I can't seem to cancel this snapshot!

Looking at the snapshot:
curl -XGET "localhost:9999/_snapshot/production_backup/_snapshot1"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2:9300]][cluster/snapshot/get]];
nested: SnapshotMissingException[[production_backup:_snapshot1] is
missing]; nested:
FileNotFoundException[/ebs/snapshot-backup/snapshot-_snapshot1 (No such
file or directory)]; ","status":404}% *

Starting a new snapshot:
curl -XPUT
"localhost:9999/_snapshot/production_backup/snapshot_1?wait_for_completion=false"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2:9300]][cluster/snapshot/create]];
nested: ConcurrentSnapshotExecutionException[[production_backup:snapshot_1]
a snapshot is already running]; ","status":503}% *

This just never completes:

curl -XDELETE "localhost:9999/_snapshot/production_backup/snapshot_1"

Any idea how I can solve this?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Andrew_Vos · May 24, 2014, 6:18pm

1.0.0. What do you mean by state exactly?

On Sat, May 24, 2014 at 6:33 PM, Igor Motov imotov@gmail.com wrote:

Which version of elsticsearch are you using? Can you send me the current
cluster state?

On Saturday, May 24, 2014 10:17:43 AM UTC-4, Andrew Vos wrote:

A few days ago I started a snapshot, but instead of using a shared
network I used the local filesystem. Because my root partition only had 8gb
(and this is where I stored the snapshots) the partition got filled up and
three of my seven elasticsearch boxes crashed almost instantly.

I've since created a new cluster and let the data replicate over. The
problem now is, I can't seem to cancel this snapshot!

Looking at the snapshot:
curl -XGET "localhost:9999/_snapshot/production_backup/_snapshot1"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/get]]; nested:
SnapshotMissingException[[production_backup:_snapshot1] is missing];
nested: FileNotFoundException[/ebs/snapshot-backup/snapshot-_snapshot1 (No
such file or directory)]; ","status":404}% *

Starting a new snapshot:
curl -XPUT
"localhost:9999/_snapshot/production_backup/snapshot_1?wait_for_completion=false"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/create]]; nested:
ConcurrentSnapshotExecutionException[[production_backup:snapshot_1] a
snapshot is already running]; ","status":503}% *

This just never completes:

curl -XDELETE "localhost:9999/_snapshot/production_backup/snapshot_1"

Any idea how I can solve this?

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/elasticsearch/rtffJxkKyzg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAE64QtHJ8FUQOvAzDN%3DXegbm%3DBQ%2BUix1R5akpwTHMsuciFqdCg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Andrew_Vos · May 24, 2014, 6:29pm

Right ok here's the cluster state

gist.github.com

https://gist.github.com/AndrewVos/29de3c6735bbd7808a81

gistfile1.txt

{
    "allocations": [],
    "blocks": {},
    "cluster_name": "grappolo",
    "master_node": "LjNI8PH5Qo2UFVoiZ-Y3zA",
    "metadata": {
        "indices": {
            "production": {
                "aliases": [],
                "settings": {

This file has been truncated. show original

On Sat, May 24, 2014 at 7:18 PM, Andrew Vos andrew.vos@gmail.com wrote:

1.0.0. What do you mean by state exactly?

On Sat, May 24, 2014 at 6:33 PM, Igor Motov imotov@gmail.com wrote:

Which version of elsticsearch are you using? Can you send me the current
cluster state?

On Saturday, May 24, 2014 10:17:43 AM UTC-4, Andrew Vos wrote:

A few days ago I started a snapshot, but instead of using a shared
network I used the local filesystem. Because my root partition only had 8gb
(and this is where I stored the snapshots) the partition got filled up and
three of my seven elasticsearch boxes crashed almost instantly.

I've since created a new cluster and let the data replicate over. The
problem now is, I can't seem to cancel this snapshot!

Looking at the snapshot:
curl -XGET "localhost:9999/_snapshot/production_backup/_snapshot1"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/get]]; nested:
SnapshotMissingException[[production_backup:_snapshot1] is missing];
nested: FileNotFoundException[/ebs/snapshot-backup/snapshot-_snapshot1 (No
such file or directory)]; ","status":404}% *

Starting a new snapshot:
curl -XPUT
"localhost:9999/_snapshot/production_backup/snapshot_1?wait_for_completion=false"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/create]]; nested:
ConcurrentSnapshotExecutionException[[production_backup:snapshot_1] a
snapshot is already running]; ","status":503}% *

This just never completes:

curl -XDELETE "localhost:9999/_snapshot/production_backup/snapshot_1"

Any idea how I can solve this?

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/elasticsearch/rtffJxkKyzg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAE64QtEo%2BPimLXv6whLS63hHzzV2rUnu3bxB2tBQTnNRKA3RLg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Igor_Motov · May 24, 2014, 6:30pm

I meant the output of the cluster state
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-state.html#cluster-state
command:

curl -XGET 'http://localhost:9200/_cluster/state'

It might be large and will contain information about your cluster that you
might not want to share publicly (index mappings). If this is the case,
please feel free to send it to me by email.

Igor

On Saturday, May 24, 2014 2:18:14 PM UTC-4, Andrew Vos wrote:

1.0.0. What do you mean by state exactly?

On Sat, May 24, 2014 at 6:33 PM, Igor Motov imotov@gmail.com wrote:

Which version of elsticsearch are you using? Can you send me the current
cluster state?

On Saturday, May 24, 2014 10:17:43 AM UTC-4, Andrew Vos wrote:

A few days ago I started a snapshot, but instead of using a shared
network I used the local filesystem. Because my root partition only had 8gb
(and this is where I stored the snapshots) the partition got filled up and
three of my seven elasticsearch boxes crashed almost instantly.

I've since created a new cluster and let the data replicate over. The
problem now is, I can't seem to cancel this snapshot!

Looking at the snapshot:
curl -XGET "localhost:9999/_snapshot/production_backup/_snapshot1"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/get]]; nested:
SnapshotMissingException[[production_backup:_snapshot1] is missing];
nested: FileNotFoundException[/ebs/snapshot-backup/snapshot-_snapshot1 (No
such file or directory)]; ","status":404}% *

Starting a new snapshot:
curl -XPUT
"localhost:9999/_snapshot/production_backup/snapshot_1?wait_for_completion=false"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/create]]; nested:
ConcurrentSnapshotExecutionException[[production_backup:snapshot_1] a
snapshot is already running]; ","status":503}% *

This just never completes:

curl -XDELETE "localhost:9999/_snapshot/production_backup/snapshot_1"

Any idea how I can solve this?

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/elasticsearch/rtffJxkKyzg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e292c3d2-b0bf-438a-986b-e32db3f2dd7a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Igor_Motov · May 24, 2014, 6:35pm

It was caused by this bug

Snapshot aborted but but still in progress · Issue #5958 · elastic/elasticsearch · GitHub The only
recovery option right now is full cluster restart.

On Saturday, May 24, 2014 2:30:06 PM UTC-4, Andrew Vos wrote:

Right ok here's the cluster state
gist:29de3c6735bbd7808a81 · GitHub

On Sat, May 24, 2014 at 7:18 PM, Andrew Vos andrew.vos@gmail.com wrote:

1.0.0. What do you mean by state exactly?

On Sat, May 24, 2014 at 6:33 PM, Igor Motov imotov@gmail.com wrote:

Which version of elsticsearch are you using? Can you send me the current
cluster state?

On Saturday, May 24, 2014 10:17:43 AM UTC-4, Andrew Vos wrote:

A few days ago I started a snapshot, but instead of using a shared
network I used the local filesystem. Because my root partition only had 8gb
(and this is where I stored the snapshots) the partition got filled up and
three of my seven elasticsearch boxes crashed almost instantly.

I've since created a new cluster and let the data replicate over. The
problem now is, I can't seem to cancel this snapshot!

Looking at the snapshot:
curl -XGET "localhost:9999/_snapshot/production_backup/_snapshot1"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/get]]; nested:
SnapshotMissingException[[production_backup:_snapshot1] is missing];
nested: FileNotFoundException[/ebs/snapshot-backup/snapshot-_snapshot1 (No
such file or directory)]; ","status":404}% *

Starting a new snapshot:
curl -XPUT
"localhost:9999/_snapshot/production_backup/snapshot_1?wait_for_completion=false"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/create]]; nested:
ConcurrentSnapshotExecutionException[[production_backup:snapshot_1] a
snapshot is already running]; ","status":503}% *

This just never completes:

curl -XDELETE "localhost:9999/_snapshot/production_backup/snapshot_1"

Any idea how I can solve this?

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/elasticsearch/rtffJxkKyzg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/9573402c-d7fe-419d-866f-cff1196fcfc4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Andrew_Vos · May 24, 2014, 6:38pm

Ok. While you're here, one other question I would like answered:

I have 10 nodes in a cluster. I want to break out three nodes into a
different cluster as a kind of backup to test out this full cluster
restart. Would it be safe to just block the other three nodes from
connecting to the main cluster? Would they form their own?

On Sat, May 24, 2014 at 7:35 PM, Igor Motov imotov@gmail.com wrote:

It was caused by this bug -
Snapshot aborted but but still in progress · Issue #5958 · elastic/elasticsearch · GitHub The only
recovery option right now is full cluster restart.

On Saturday, May 24, 2014 2:30:06 PM UTC-4, Andrew Vos wrote:

Right ok here's the cluster state https://gist.github.com/
AndrewVos/29de3c6735bbd7808a81

On Sat, May 24, 2014 at 7:18 PM, Andrew Vos andrew.vos@gmail.com wrote:

1.0.0. What do you mean by state exactly?

On Sat, May 24, 2014 at 6:33 PM, Igor Motov imotov@gmail.com wrote:

Which version of elsticsearch are you using? Can you send me the
current cluster state?

On Saturday, May 24, 2014 10:17:43 AM UTC-4, Andrew Vos wrote:

A few days ago I started a snapshot, but instead of using a shared
network I used the local filesystem. Because my root partition only had 8gb
(and this is where I stored the snapshots) the partition got filled up and
three of my seven elasticsearch boxes crashed almost instantly.

I've since created a new cluster and let the data replicate over. The
problem now is, I can't seem to cancel this snapshot!

Looking at the snapshot:
curl -XGET "localhost:9999/_snapshot/production_backup/_snapshot1"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/get]]; nested:
SnapshotMissingException[[production_backup:_snapshot1] is missing];
nested: FileNotFoundException[/ebs/snapshot-backup/snapshot-_snapshot1 (No
such file or directory)]; ","status":404}% *

Starting a new snapshot:
curl -XPUT
"localhost:9999/_snapshot/production_backup/snapshot_1?wait_for_completion=false"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/create]]; nested:
ConcurrentSnapshotExecutionException[[production_backup:snapshot_1] a
snapshot is already running]; ","status":503}% *

This just never completes:

curl -XDELETE "localhost:9999/_snapshot/production_backup/snapshot_1"

Any idea how I can solve this?

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit https://groups.google.com/d/
topic/elasticsearch/rtffJxkKyzg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/
msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%
40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/elasticsearch/rtffJxkKyzg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/9573402c-d7fe-419d-866f-cff1196fcfc4%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/9573402c-d7fe-419d-866f-cff1196fcfc4%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAE64QtGZ2SKt3SLOXSS1D52jgpHXHNM2E0g8B9qtNWgraSWUuw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Igor_Motov · May 24, 2014, 6:51pm

If your cluster is setup correctly (with proper value set
for discovery.zen.minimum_master_nodes) they shouldn't. But if you are
running without discovery.zen.minimum_master_nodes set, they might indeed
form a new cluster. Obviously some shards might end up in one cluster and
not in the other and if you are indexing while this is happening you will
lose some data. I would say it's pretty..... extreme way to test full
cluster restart.

On Saturday, May 24, 2014 2:38:44 PM UTC-4, Andrew Vos wrote:

Ok. While you're here, one other question I would like answered:

I have 10 nodes in a cluster. I want to break out three nodes into a
different cluster as a kind of backup to test out this full cluster
restart. Would it be safe to just block the other three nodes from
connecting to the main cluster? Would they form their own?

On Sat, May 24, 2014 at 7:35 PM, Igor Motov imotov@gmail.com wrote:

It was caused by this bug -
Snapshot aborted but but still in progress · Issue #5958 · elastic/elasticsearch · GitHub The only
recovery option right now is full cluster restart.

On Saturday, May 24, 2014 2:30:06 PM UTC-4, Andrew Vos wrote:

Right ok here's the cluster state https://gist.github.com/
AndrewVos/29de3c6735bbd7808a81

On Sat, May 24, 2014 at 7:18 PM, Andrew Vos andrew.vos@gmail.comwrote:

1.0.0. What do you mean by state exactly?

On Sat, May 24, 2014 at 6:33 PM, Igor Motov imotov@gmail.com wrote:

Which version of elsticsearch are you using? Can you send me the
current cluster state?

On Saturday, May 24, 2014 10:17:43 AM UTC-4, Andrew Vos wrote:

A few days ago I started a snapshot, but instead of using a shared
network I used the local filesystem. Because my root partition only had 8gb
(and this is where I stored the snapshots) the partition got filled up and
three of my seven elasticsearch boxes crashed almost instantly.

I've since created a new cluster and let the data replicate over. The
problem now is, I can't seem to cancel this snapshot!

Looking at the snapshot:
curl -XGET "localhost:9999/_snapshot/production_backup/_snapshot1"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/get]]; nested:
SnapshotMissingException[[production_backup:_snapshot1] is missing];
nested: FileNotFoundException[/ebs/snapshot-backup/snapshot-_snapshot1 (No
such file or directory)]; ","status":404}% *

Starting a new snapshot:
curl -XPUT
"localhost:9999/_snapshot/production_backup/snapshot_1?wait_for_completion=false"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/create]]; nested:
ConcurrentSnapshotExecutionException[[production_backup:snapshot_1] a
snapshot is already running]; ","status":503}% *

This just never completes:

curl -XDELETE "localhost:9999/_snapshot/production_backup/snapshot_1"

Any idea how I can solve this?

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit https://groups.google.com/d/
topic/elasticsearch/rtffJxkKyzg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/
msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%
40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/elasticsearch/rtffJxkKyzg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/9573402c-d7fe-419d-866f-cff1196fcfc4%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/9573402c-d7fe-419d-866f-cff1196fcfc4%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/ae997156-db75-4a2a-8a29-1435a50bb0f1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Andrew_Vos · May 24, 2014, 6:53pm

Well it's the only way I can do it without downtime. Unless of course by
"full cluster restart" you mean restarting one node at a time?

On Sat, May 24, 2014 at 7:51 PM, Igor Motov imotov@gmail.com wrote:

If your cluster is setup correctly (with proper value set
for discovery.zen.minimum_master_nodes) they shouldn't. But if you are
running without discovery.zen.minimum_master_nodes set, they might indeed
form a new cluster. Obviously some shards might end up in one cluster and
not in the other and if you are indexing while this is happening you will
lose some data. I would say it's pretty..... extreme way to test full
cluster restart.

On Saturday, May 24, 2014 2:38:44 PM UTC-4, Andrew Vos wrote:

Ok. While you're here, one other question I would like answered:

I have 10 nodes in a cluster. I want to break out three nodes into a
different cluster as a kind of backup to test out this full cluster
restart. Would it be safe to just block the other three nodes from
connecting to the main cluster? Would they form their own?

On Sat, May 24, 2014 at 7:35 PM, Igor Motov imotov@gmail.com wrote:

It was caused by this bug - https://github.com/
elasticsearch/elasticsearch/issues/5958 The only recovery option right
now is full cluster restart.

On Saturday, May 24, 2014 2:30:06 PM UTC-4, Andrew Vos wrote:

Right ok here's the cluster state https://gist.github.com/
AndrewVos/29de3c6735bbd7808a81

On Sat, May 24, 2014 at 7:18 PM, Andrew Vos andrew.vos@gmail.comwrote:

1.0.0. What do you mean by state exactly?

On Sat, May 24, 2014 at 6:33 PM, Igor Motov imotov@gmail.com wrote:

Which version of elsticsearch are you using? Can you send me the
current cluster state?

On Saturday, May 24, 2014 10:17:43 AM UTC-4, Andrew Vos wrote:

A few days ago I started a snapshot, but instead of using a shared
network I used the local filesystem. Because my root partition only had 8gb
(and this is where I stored the snapshots) the partition got filled up and
three of my seven elasticsearch boxes crashed almost instantly.

I've since created a new cluster and let the data replicate over.
The problem now is, I can't seem to cancel this snapshot!

Looking at the snapshot:
curl -XGET "localhost:9999/_snapshot/production_backup/_snapshot1"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/get]]; nested:
SnapshotMissingException[[production_backup:_snapshot1] is missing];
nested: FileNotFoundException[/ebs/snapshot-backup/snapshot-_snapshot1 (No
such file or directory)]; ","status":404}% *

Starting a new snapshot:
curl -XPUT
"localhost:9999/_snapshot/production_backup/snapshot_1?wait_for_completion=false"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/create]]; nested:
ConcurrentSnapshotExecutionException[[production_backup:snapshot_1] a
snapshot is already running]; ","status":503}% *

This just never completes:

curl -XDELETE "localhost:9999/_snapshot/prod
uction_backup/snapshot_1"

Any idea how I can solve this?

--
You received this message because you are subscribed to a topic in
the Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit https://groups.google.com/d/to
pic/elasticsearch/rtffJxkKyzg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/
msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%40goo
glegroups.com https://groups.google.com/d/msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit https://groups.google.com/d/
topic/elasticsearch/rtffJxkKyzg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/
msgid/elasticsearch/9573402c-d7fe-419d-866f-cff1196fcfc4%
40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/9573402c-d7fe-419d-866f-cff1196fcfc4%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/elasticsearch/rtffJxkKyzg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/ae997156-db75-4a2a-8a29-1435a50bb0f1%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/ae997156-db75-4a2a-8a29-1435a50bb0f1%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAE64QtGmnSrz7AM-Ad_gkg9UrGCfOPPaMuB57UGOUTP6RUnLSQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Igor_Motov · May 27, 2014, 3:39pm

Yes, by "full cluster restart" I meant shutting down all nodes and then
starting them up again, which means downtime. However, after thinking about
the issue over the long weekend, I wrote a simple utility that cleans up
snapshots without need to restart the cluster

GitHub - imotov/elasticsearch-snapshot-cleanup

On Saturday, May 24, 2014 2:53:27 PM UTC-4, Andrew Vos wrote:

Well it's the only way I can do it without downtime. Unless of course by
"full cluster restart" you mean restarting one node at a time?

On Sat, May 24, 2014 at 7:51 PM, Igor Motov imotov@gmail.com wrote:

If your cluster is setup correctly (with proper value set
for discovery.zen.minimum_master_nodes) they shouldn't. But if you are
running without discovery.zen.minimum_master_nodes set, they might indeed
form a new cluster. Obviously some shards might end up in one cluster and
not in the other and if you are indexing while this is happening you will
lose some data. I would say it's pretty..... extreme way to test full
cluster restart.

On Saturday, May 24, 2014 2:38:44 PM UTC-4, Andrew Vos wrote:

Ok. While you're here, one other question I would like answered:

I have 10 nodes in a cluster. I want to break out three nodes into a
different cluster as a kind of backup to test out this full cluster
restart. Would it be safe to just block the other three nodes from
connecting to the main cluster? Would they form their own?

On Sat, May 24, 2014 at 7:35 PM, Igor Motov imotov@gmail.com wrote:

It was caused by this bug - https://github.com/
elasticsearch/elasticsearch/issues/5958 The only recovery option right
now is full cluster restart.

On Saturday, May 24, 2014 2:30:06 PM UTC-4, Andrew Vos wrote:

Right ok here's the cluster state https://gist.github.com/
AndrewVos/29de3c6735bbd7808a81

On Sat, May 24, 2014 at 7:18 PM, Andrew Vos andrew.vos@gmail.comwrote:

1.0.0. What do you mean by state exactly?

On Sat, May 24, 2014 at 6:33 PM, Igor Motov imotov@gmail.com wrote:

Which version of elsticsearch are you using? Can you send me the
current cluster state?

On Saturday, May 24, 2014 10:17:43 AM UTC-4, Andrew Vos wrote:

A few days ago I started a snapshot, but instead of using a shared
network I used the local filesystem. Because my root partition only had 8gb
(and this is where I stored the snapshots) the partition got filled up and
three of my seven elasticsearch boxes crashed almost instantly.

I've since created a new cluster and let the data replicate over.
The problem now is, I can't seem to cancel this snapshot!

Looking at the snapshot:
curl -XGET "localhost:9999/_snapshot/production_backup/_snapshot1"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/get]]; nested:
SnapshotMissingException[[production_backup:_snapshot1] is missing];
nested: FileNotFoundException[/ebs/snapshot-backup/snapshot-_snapshot1 (No
such file or directory)]; ","status":404}% *

Starting a new snapshot:
curl -XPUT
"localhost:9999/_snapshot/production_backup/snapshot_1?wait_for_completion=false"
*{"error":"RemoteTransportException[[Smuggler][inet[/172.17.0.2
http://172.17.0.2:9300]][cluster/snapshot/create]]; nested:
ConcurrentSnapshotExecutionException[[production_backup:snapshot_1] a
snapshot is already running]; ","status":503}% *

This just never completes:

curl -XDELETE "localhost:9999/_snapshot/prod
uction_backup/snapshot_1"

Any idea how I can solve this?

--
You received this message because you are subscribed to a topic in
the Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit https://groups.google.com/d/to
pic/elasticsearch/rtffJxkKyzg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/41136f07-3af
d-4452-87e4-a54f983db539%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/41136f07-3afd-4452-87e4-a54f983db539%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit https://groups.google.com/d/
topic/elasticsearch/rtffJxkKyzg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/
msgid/elasticsearch/9573402c-d7fe-419d-866f-cff1196fcfc4%
40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/9573402c-d7fe-419d-866f-cff1196fcfc4%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/elasticsearch/rtffJxkKyzg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/ae997156-db75-4a2a-8a29-1435a50bb0f1%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/ae997156-db75-4a2a-8a29-1435a50bb0f1%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/97b8d2f5-078f-4a24-b5a1-b97c9b61b87f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Topic		Replies	Views
Details on snapshot and restore in ES 1.0 Elasticsearch	16	527	July 6, 2017
Taking a snapshot causes nodes to fall out of the cluster Elasticsearch	4	511	May 3, 2022
Snapshot & Restore in a cluster of two nodes Elasticsearch	4	1293	July 6, 2017
ES backups without using snapshots? Elasticsearch	5	1255	July 6, 2017
Snapshot a single node Elasticsearch	4	324	July 6, 2017

Can't stop a snapshot running on my cluster

Related topics