Encoding Issue


(Walendo) #1

Hi,

I'm trying to index (PUT) a document that is something like the string below
(in Portuguese), which is in the ISO-8859-1 charset, and the Elasticsearch
server sends this error:

JsonParseException[Invalid UTF-8 middle byte 0x6f\n at [Source:
[B@dd8904; line: 6, column: 58]];

Is there any way to tell the parser the appropriate encoding to use?

{
"previsao_cidade" : {
"codigo" : 6377,"previsao": {
"data" : "08/04/2010",
"nome_dia" : "Quinta",
"frase" : "Sol com muitas nuvens durante o dia e períodos de céu nublado. Noite com muitas nuvens.",
"minima" : "23°C",
"maxima" : "29°C",
"probabilidade" : "0%",
"precipitacao" : "0mm",
"ico_manha" : "2r",
"ico_tarde" : "2r",
"ico_noite" : "2rn",
"uv" : "15",
"dir_vento" : "SSE",
"int_max_vento" : "15 nós",
"int_vento" : "10 nós",
"umidade" : "86 %",
"solnascente" : "05h31",
"solpoente" : "17h28"
}
}
}

Thanks.
Jorge


(Clinton Gormley) #2

Hiya

The JSON RFC (http://www.ietf.org/rfc/rfc4627.txt) specifies that JSON
must be encoded in Unicode.

So if you're using ISO-8859-* in your application, then your JSON encoder
(or you, before the text reaches the JSON encoder) needs to convert your text
to Unicode (e.g. UTF-8), and you will then need to convert the results from
ElasticSearch back into ISO-8859-*.
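
A minimal sketch of that round trip in Python, assuming the application holds its JSON as ISO-8859-1 bytes. The helper names `latin1_json_to_utf8` and `utf8_to_latin1` and the sample body are made up for illustration; they are not part of any Elasticsearch client.

```python
import json

# Hypothetical request body, as an ISO-8859-1 application would produce it.
latin1_body = '{"frase": "períodos de céu nublado", "minima": "23°C"}'.encode("iso-8859-1")


def latin1_json_to_utf8(raw: bytes) -> bytes:
    """Re-encode an ISO-8859-1 JSON body as UTF-8 before sending it."""
    text = raw.decode("iso-8859-1")  # bytes -> Unicode string
    json.loads(text)                 # fail early if it is not valid JSON
    return text.encode("utf-8")      # Unicode string -> UTF-8 bytes


def utf8_to_latin1(raw: bytes) -> bytes:
    """Convert a UTF-8 response back to ISO-8859-1 for the application.

    Characters with no ISO-8859-1 equivalent are replaced with '?'.
    """
    return raw.decode("utf-8").encode("iso-8859-1", errors="replace")


utf8_body = latin1_json_to_utf8(latin1_body)   # what goes over the wire
restored = utf8_to_latin1(utf8_body)           # what the application gets back
```

The UTF-8 body is longer than the ISO-8859-1 one because each accented character takes two bytes, so any Content-Length should be computed after the conversion.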

clint

Web Announcements Limited is a company registered in England and Wales,
with company number 05608868, with registered address at 10 Arvon Road,
London, N5 1PR.


(Shay Banon) #3

Yep, elasticsearch only supports Unicode (UTF-8) for JSON and produces
Unicode.

-shay.banon
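
For anyone hitting the same message, here is a rough illustration of where "Invalid UTF-8 middle byte 0x6f" likely comes from, assuming the offending word is "períodos" from the "frase" field. In ISO-8859-1 the "í" is the single byte 0xED, which a UTF-8 parser reads as the start of a three-byte sequence; the next byte is "o" (0x6F), which is not a valid continuation byte. The helper name `decodes_as_utf8` is made up for this sketch.

```python
# "períodos" as an ISO-8859-1 application would send it.
latin1_bytes = "períodos".encode("iso-8859-1")   # b'per\xedodos'


def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if the bytes are well-formed UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


lead = latin1_bytes[3]        # 0xED: the ISO-8859-1 byte for 'í'
following = latin1_bytes[4]   # 0x6F: the letter 'o'

# 0xED matches 1110xxxx, the UTF-8 lead byte of a three-byte sequence.
is_three_byte_lead = (lead & 0xF0) == 0xE0
# A UTF-8 continuation byte must match 10xxxxxx; 'o' does not.
is_continuation = (following & 0xC0) == 0x80

latin1_ok = decodes_as_utf8(latin1_bytes)
utf8_ok = decodes_as_utf8("períodos".encode("utf-8"))
```

Here `latin1_ok` comes out False and `utf8_ok` True, which is the same check the server's JSON parser is effectively applying to the request body.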



(Walendo) #4

Great! Thanks a lot.


