[Hadoop] Hadoop plugin indices with ':' colon - not able to snapshot (?)


(Mateusz Kaczynski) #1

Whenever I try to take a snapshot of an index which contains a ':' colon in
its name, I end up with the following trace:

{
"error":"IllegalArgumentException[java.net.URISyntaxException: Relative path in absolute URI: crawl:1]; nested: URISyntaxException[Relative path in absolute URI: crawl:1]; ",
"status":500
}

It does not matter whether the 'indices' argument is provided or not; the
snapshot name is set to 'snapshot'. I have tried to specify the index with
the colon escaped as '%3a', but in this case the name does not match any
available index.

I assume this is related to https://issues.apache.org/jira/browse/HDFS-13
(filenames with ':' colon throws java.lang.IllegalArgumentException)?

The question is: is there a way to escape the character (if not within the
request, then perhaps in the code itself)? And if so, would creating a
feature request make sense?

Many thanks,
Mateusz

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/ce4b5ffb-f473-4862-8b67-bec1f08bd840%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Costin Leau) #2

Hi,

':' has a special meaning in a URI, which is what HDFS uses. You basically have to either escape the character (%3A) or
use a different character.
Potentially you can rename the file to the desired name by running a separate command after the job has completed.

However, no URI/URL can be constructed from your file name in the format you desire, so you'll have to encode/decode
the location every time.
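
The error in the original post can be reproduced with plain java.net.URI. The sketch below assumes, without having checked the plugin or Hadoop source, that the name is split at its first ':' into a scheme and a path before a URI is built from the parts. The class and method names (ColonInUri, tryBuild) are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class ColonInUri {

    // ASSUMPTION: the text before the first ':' is treated as a URI scheme
    // and the rest as the path. This imitates the suspected behaviour; it is
    // not code taken from the plugin or from Hadoop.
    static String tryBuild(String name) {
        int colon = name.indexOf(':');
        String scheme = colon < 0 ? null : name.substring(0, colon);
        String path = colon < 0 ? name : name.substring(colon + 1);
        try {
            // Multi-argument constructor: scheme, authority, path, query, fragment.
            return "ok: " + new URI(scheme, null, path, null, null);
        } catch (URISyntaxException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // prints "error: Relative path in absolute URI: crawl:1"
        System.out.println(tryBuild("crawl:1"));
        // prints "ok: crawl-1"
        System.out.println(tryBuild("crawl-1"));
    }
}
```

Under that assumption, "crawl:1" is read as the scheme "crawl" followed by the relative path "1", which java.net.URI rejects, while "crawl-1" has no scheme and is accepted.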

--
Costin


(Mateusz Kaczynski) #3

Hi Costin, thanks for the response.

Yes, I understand that this is a restricted character and would require
escaping.

There are perhaps 2 separate issues here:

  1. If 'indices' is not specified, i.e.
    curl -XPUT "localhost:9200/_snapshot/hdfs-cluster/snapshot_1"
    elasticsearch complains with the mentioned error as soon as it finds
    any index with a colon in its name, and stops.

  2. If 'indices' is in use, I don't see a way (would be great to be proved
    wrong) to escape the name, i.e. doing something like this
    curl -XPUT "localhost:9200/_snapshot/hdfs-cluster/snapshot_2" -d '{"indices": "crawl%3A1"}'
    finishes empty, as no index with such a name exists.

That's why I thought escaping might be done somewhere in the plugin?

Mateusz


(Costin Leau) #4

If it's only the index that you are interested in, why not use aliases instead?
Have the index snapshotted/restored under a legal fs name (crawl-1) and have an alias
that uses the special character (crawl:1) point to it?
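
As a rough illustration of the alias idea, the sketch below builds the request body for Elasticsearch's _aliases endpoint that would point the alias crawl:1 at an index named crawl-1. It assumes, as the suggestion implies, that the cluster accepts a colon in an alias name. The class and method names (AliasRequest, addAliasBody) are hypothetical, and the body is only printed, not sent.

```java
public class AliasRequest {

    // Builds the JSON body for an "add alias" action. The names are assumed
    // to contain no characters that need JSON escaping (quotes, backslashes).
    static String addAliasBody(String index, String alias) {
        return String.format(
            "{\"actions\":[{\"add\":{\"index\":\"%s\",\"alias\":\"%s\"}}]}",
            index, alias);
    }

    public static void main(String[] args) {
        // Body to POST to localhost:9200/_aliases (host as used in the thread).
        // prints {"actions":[{"add":{"index":"crawl-1","alias":"crawl:1"}}]}
        System.out.println(addAliasBody("crawl-1", "crawl:1"));
    }
}
```

This only helps when the index itself can carry the colon-free name; it does not change how an existing index named crawl:1 is stored.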

--
Costin


(Mateusz Kaczynski) #5

I'm afraid it's the other way around, i.e. we already have 50 indices or so
(+tools) using a colon in the base name.

And just to check that as well, specifying an alias in 'indices' instead of
the base name still results in the same error.


(Costin Leau) #6

If the alias method doesn't work for you then escaping might be the only way (assuming there's going to be a reliable
way to apply it). I suggest raising an issue, since this sounds like generic functionality that could work across all
snapshot/restore plugins.
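
For such an issue, a reversible mapping between index names and storage-safe names might look like the sketch below. It uses percent-encoding from the Java standard library; IndexNameEscaper and its methods are hypothetical names, not part of Elasticsearch or the plugin, and whether the escaped names are acceptable to a given repository backend is not verified here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class IndexNameEscaper {

    // Percent-encodes characters such as ':' so the name can be used as a
    // path segment; for example "crawl:1" becomes "crawl%3A1".
    static String escape(String indexName) {
        try {
            return URLEncoder.encode(indexName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    // Reverses escape(), recovering the original index name.
    static String unescape(String storedName) {
        try {
            return URLDecoder.decode(storedName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String stored = escape("crawl:1");
        System.out.println(stored);           // prints crawl%3A1
        System.out.println(unescape(stored)); // prints crawl:1
    }
}
```

Because '%' itself is encoded as well, the mapping round-trips, so names could be escaped when a snapshot is written and unescaped when it is listed or restored.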

Cheers,

--
Costin


(Mateusz Kaczynski) #7

Thanks. Yes, I will try to dig around the plugin code.

I'm not entirely sure I get how generic this is; my understanding was that it
was more HDFS-specific.

