Content encoding issues


(jp.lorandi) #1

I'm currently attempting to index documents encoded with UTF-8 and I'm getting strange garbage out of _source when retrieving data. Has anybody had the same problem? Which encoding does ES use out of the box?

TIA,
JP


(Shay Banon) #2

UTF8.

On Thursday, July 14, 2011 at 6:21 AM, jp.lorandi@cfyar.com wrote:

I'm currently attempting to index documents encoded with UTF-8 and I'm getting strange garbage out of _source when retrieving data. Has anybody had the same problem? Which encoding does ES use out of the box?

TIA,
JP


(ppearcy) #3

I can vouch that the encoding support is solid. We store English, various
European languages, Chinese, and Japanese in UTF-8 with no problems. Most
likely the text is getting mangled on the client side. Do you see encoding
issues when you run the request from a browser with a JSON plugin?
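
To illustrate what client-side mangling usually looks like, here is a minimal Python sketch. It does not talk to Elasticsearch at all; the sample string and variable names are made up for illustration. It only shows that UTF-8 bytes read back with the wrong charset (Latin-1 is assumed here as the wrong guess) turn into the kind of garbage described above.

```python
# Hypothetical sample text; any non-ASCII string behaves the same way.
original = "日本語 café"

# The bytes a UTF-8 server would put on the wire.
wire_bytes = original.encode("utf-8")

# A client that wrongly assumes Latin-1 gets one character per byte: garbage.
garbled = wire_bytes.decode("latin-1")

# A client that decodes the same bytes as UTF-8 gets the text back intact.
correct = wire_bytes.decode("utf-8")

print(repr(garbled))
print(correct)
```

If the garbage you see has this shape (for example "Ã©" where "é" should be), the stored data is probably fine and the client or terminal is decoding the response with the wrong charset.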

On Jul 13, 9:22 pm, Shay Banon shay.ba...@elasticsearch.com wrote:

UTF8.



(Shay Banon) #4

No need to vouch :), can you provide a recreation?

On Thursday, July 14, 2011 at 6:48 AM, Paul wrote:

I can vouch that the encoding support is solid. We store English, various
European languages, Chinese, and Japanese in UTF-8 with no problems. Most
likely the text is getting mangled on the client side. Do you see encoding
issues when you run the request from a browser with a JSON plugin?


