Elasticsearch 1.2.2 snapshot throws error but still completes?

So I'm running into a bizarre situation when taking a snapshot with
curator. The command produces the following output:

$ curator --host haproxyStaging snapshot --repository my_s3_repository --older-than 5

2014-09-22 12:51:28,000 INFO Job starting...
2014-09-22 12:51:28,000 INFO Default timeout of 30 seconds is too low for command SNAPSHOT. Overriding to 21,600 seconds (6 hours).
2014-09-22 12:51:28,015 INFO Beginning SNAPSHOT operations...
2014-09-22 12:51:28,028 INFO Attempting to create snapshot for index logstash-2014.09.16.

Traceback (most recent call last):
  File "/bin/curator", line 8, in <module>
    load_entry_point('elasticsearch-curator==1.2.2', 'console_scripts', 'curator')()
  File "/lib/python2.7/site-packages/curator/curator.py", line 731, in main
    arguments.func(client, **argdict)
  File "/lib/python2.7/site-packages/curator/curator.py", line 585, in command_loop
    skipped = op(client, index_name, **kwargs)
  File "/lib/python2.7/site-packages/curator/curator.py", line 406, in _create_snapshot
    client.snapshot.create(repository=repository, snapshot=snap_name, body=body, wait_for_completion=wait_for_completion)
  File "/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 68, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/lib/python2.7/site-packages/elasticsearch/client/snapshot.py", line 19, in create
    repository, snapshot), params=params, body=body)
  File "/lib/python2.7/site-packages/elasticsearch/transport.py", line 284, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 55, in perform_request
    self._raise_error(response.status, raw_data)
  File "/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 97, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.TransportError: TransportError(503, u'ConcurrentSnapshotExecutionException[[my_s3_repository:logstash-2014.09.16] a snapshot is already running]')

But if I check the cluster state, I see that the snapshot has started and
appears to be running; it is not in an aborted state. Depending on how much
time has passed, some of the shards already show SUCCESS:
  "snapshots" : {
    "snapshots" : [ {
      "repository" : "my_s3_repository",
      "snapshot" : "logstash-2014.09.16",
      "include_global_state" : false,
      "state" : "STARTED",
      "indices" : [ "logstash-2014.09.16" ],
      "shards" : [ {
        "index" : "logstash-2014.09.16",
        "shard" : 0,
        "state" : "SUCCESS",
        "node" : "K5iGyuMbSHWMOIjIiKmZRg"
      }, {
        "index" : "logstash-2014.09.16",
        "shard" : 1,
        "state" : "SUCCESS",
        "node" : "pusv6F-XRPqzmTHtdhijTg"
      }, {
        "index" : "logstash-2014.09.16",
        "shard" : 2,
        "state" : "INIT",
        "node" : "XctLsyHWQW-mYdkzAai7Ww"
      }, {
        "index" : "logstash-2014.09.16",
        "shard" : 3,
        "state" : "INIT",
        "node" : "XctLsyHWQW-mYdkzAai7Ww"
      }, {
        "index" : "logstash-2014.09.16",
        "shard" : 4,
        "state" : "INIT",
        "node" : "pusv6F-XRPqzmTHtdhijTg"
      } ]
    } ]
  }
},
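
For anyone who wants to check this programmatically instead of reading the raw cluster state, here is a minimal sketch that tallies the per-shard states for each running snapshot. It assumes you have already fetched the cluster state and pulled out the inner "snapshots" object shown above; the function name summarize_snapshots is made up for illustration, and the sample data is copied from the output in this post.

```python
from collections import Counter


def summarize_snapshots(cluster_snapshots):
    """Map (repository, snapshot) to (snapshot state, Counter of shard states).

    `cluster_snapshots` is the object stored under the outer "snapshots"
    key of the cluster state, i.e. a dict with an inner "snapshots" list.
    """
    summary = {}
    for entry in cluster_snapshots.get("snapshots", []):
        key = (entry["repository"], entry["snapshot"])
        shard_states = Counter(s["state"] for s in entry.get("shards", []))
        summary[key] = (entry["state"], shard_states)
    return summary


# Sample data copied from the cluster state fragment in this post.
index = "logstash-2014.09.16"
sample = {
    "snapshots": [{
        "repository": "my_s3_repository",
        "snapshot": index,
        "include_global_state": False,
        "state": "STARTED",
        "indices": [index],
        "shards": [
            {"index": index, "shard": 0, "state": "SUCCESS",
             "node": "K5iGyuMbSHWMOIjIiKmZRg"},
            {"index": index, "shard": 1, "state": "SUCCESS",
             "node": "pusv6F-XRPqzmTHtdhijTg"},
            {"index": index, "shard": 2, "state": "INIT",
             "node": "XctLsyHWQW-mYdkzAai7Ww"},
            {"index": index, "shard": 3, "state": "INIT",
             "node": "XctLsyHWQW-mYdkzAai7Ww"},
            {"index": index, "shard": 4, "state": "INIT",
             "node": "pusv6F-XRPqzmTHtdhijTg"},
        ],
    }]
}

for (repo, snap), (state, shards) in summarize_snapshots(sample).items():
    # One line per snapshot: overall state plus a count of each shard state.
    print(repo, snap, state, dict(shards))
```

A snapshot whose overall state is STARTED with a mix of SUCCESS and INIT shards, as in the sample, is still making progress rather than failing.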

Is there a way to get rid of these false positives, so that I actually know
when there's a real failure and not just some odd intermediate state?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/973c682c-853e-45ac-b013-77d966b1e71c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

I should also mention that if I try to take snapshots of more than one
index with curator, it throws the error but still completes the first
snapshot. If I run it again, it does the same thing for the next index in
the set. So I end up having to repeat the process to get all the snapshots
I need. It doesn't make sense to me, and the only issues I've found online
are about snapshots that fail in an aborted state, so this seems to be
different. BTW, this also happens using curl; it's not just curator.
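
In case it helps frame the question, here is a rough sketch of the behavior I'd like: treat the "already running" error as a non-fatal condition and then poll until the snapshot reaches a final state. This is not what curator does; snapshot_and_wait is a hypothetical helper. It assumes a client object shaped like elasticsearch-py (client.snapshot.create, as in the traceback, and client.snapshot.get), and it assumes that getting the snapshot by name reports IN_PROGRESS while it is still running. It also assumes the error refers to the snapshot we just asked for; if a different snapshot is the one running, the get call will presumably fail and that error propagates.

```python
import time


def snapshot_and_wait(client, repository, snapshot, body,
                      poll_seconds=30, sleep=time.sleep):
    """Start a snapshot, tolerate 'already running', return the final state."""
    try:
        client.snapshot.create(repository=repository, snapshot=snapshot,
                               body=body, wait_for_completion=False)
    except Exception as exc:
        # Only swallow the specific error seen in the traceback above;
        # anything else is re-raised as a real failure.
        if "ConcurrentSnapshotExecutionException" not in str(exc):
            raise

    while True:
        info = client.snapshot.get(repository=repository, snapshot=snapshot)
        state = info["snapshots"][0]["state"]
        if state != "IN_PROGRESS":
            # Final state as reported by the cluster, e.g. SUCCESS or FAILED.
            return state
        sleep(poll_seconds)
```

Looping over the indices and calling this once per index would avoid having to re-run the whole job by hand, since each call only returns after the previous snapshot has finished.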

On Monday, September 22, 2014 1:00:12 PM UTC-5, rhea ghosh wrote:
