Hey all,
I just performed several tests.
Take the character "愛": its UTF-8 encoding is "\xe6\x84\x9b", and its
Unicode code point is U+611B ('\u611b').
I ran the following commands:
curl -XPOST 'http://192.168.50.7:9200/data/main' -d '{"name":"愛"}'
--> {"_index":"data","_type":"main","_id":"bFPGRGFaTcqsS1hOfWVjDQ","_version":1,"created":true}
curl -XPOST 'http://192.168.50.7:9200/data/main' -d '{"name":"\u611b"}'
--> {"_index":"data","_type":"main","_id":"NntYqlwRQl6QZ3ROAclU5w","_version":1,"created":true}
curl -XPOST 'http://192.168.50.7:9200/data/main' -d '{"name":"\xe6\x84\x9b"}'
--> {"error":"MapperParsingException[failed to parse [name]]; nested:
JsonParseException[Unrecognized character escape 'x' (code 120)\n at
[Source: [B@b41c580; line: 1, column: 12]]; ","status":400}
The JSON parse exception itself is not surprising. But what is the difference
between '{"name":"\xe6\x84\x9b"}' and '{"name":"愛"}'?
PS. My native encoding is UTF-8.
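To make the three request bodies concrete, here is a minimal sketch using Python's standard json module rather than ES itself, so it only assumes that the JSON parser inside ES follows the same JSON escape rules. Inside single quotes the shell passes \xe6 through as literal backslash-x characters, while "愛" goes out as three raw UTF-8 bytes, and JSON defines \uXXXX escapes but no \x escape. The variable and function names are only illustrative.

```python
import json

word = "愛"

# The UTF-8 encoding of the character is the three bytes e6 84 9b.
utf8_bytes = word.encode("utf-8")

# Body 1: the raw UTF-8 bytes, as a UTF-8 terminal would send them.
raw_body = b'{"name":"' + utf8_bytes + b'"}'
from_raw = json.loads(raw_body.decode("utf-8"))

# Body 2: the JSON escape \u611b, which is pure ASCII on the wire.
from_escape = json.loads('{"name":"\\u611b"}')


def parses(text):
    """Return True if text is valid JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Body 3: literal backslash-x sequences; \x is not a JSON escape.
hex_escape_ok = parses('{"name":"\\xe6\\x84\\x9b"}')

print(utf8_bytes, from_raw, from_escape, hex_escape_ok)
```

In this sketch the first two bodies decode to the same string, and only the third one is rejected by the parser.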
Next I tried to find them with:
curl 'http://192.168.50.7:9200/data/main/_search?pretty' -d '{"query":{"term":{"name": "愛"}}}'
I expected only one hit, but I got two. Why? It seems some
transformation happens internally.
Ideas?
Thanks a lot.
On Tuesday, March 4, 2014 at 11:57:33 AM UTC+8, Ivan Ji wrote:
Hi all,
I am wondering about encoding in ES. What encoding does ES use for
storage? And what encoding is used during operations?
Through the REST API, any JSON document can be sent to ES for indexing. So
suppose I send a document as follows:
curl -XPOST 'http://192.168.50.7:9200/data/main/' -d '{"name":"蒼天",
"type":"file", "extension":"tmp", "mime_type": "application/text"}'
As we can see, the "name" field is not ASCII; assume my native encoding
is UTF-8. What happens during the insertion?
Does ES store the original UTF-8 string as-is? And if the native
encoding is a less common one, e.g. Latin, what would happen?
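To illustrate why the client encoding matters, here is a small Python sketch, taking Latin-1 (ISO-8859-1) as one assumed example of a non-UTF-8 encoding and "é" as an arbitrary sample character. It only shows that the same text becomes different bytes on the wire, and that the Latin-1 bytes are not valid UTF-8. How ES reacts to such a body is not shown here.

```python
text = "é"  # arbitrary sample character, chosen because Latin-1 can encode it

utf8_bytes = text.encode("utf-8")      # two bytes: c3 a9
latin1_bytes = text.encode("latin-1")  # one byte: e9


def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(utf8_bytes, is_valid_utf8(utf8_bytes))
print(latin1_bytes, is_valid_utf8(latin1_bytes))
```

A receiver that expects UTF-8 therefore cannot recover the original character from the Latin-1 bytes unless it is told which encoding was used.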
Ideas?
Regards
Ivan