How to load a json file that contains special characters

AussiePete2015 · November 25, 2016, 2:47am

Hi all,

I have created a JSON file using Talend, which loads the text of SAS code and transforms it into JSON.
I have also created an index; however, the import fails because the SAS code contains many different symbols.

e.g.
{"index":{"_index":"sascode_idx", "_type":"content", "_id": "1"}}
{"BuildAllTriangles":[{"content":"/**************************************************************************\r\n* PROGRAM NAME : BuildAllTriangles.sas\r\n* PROGRAMMER : Peter Birk\r\n* DATE WRITTEN : 20120912\r\n* DESCRIPTION : \r\n\r\n\tMake lots of liability triangles in an improbably short amount of\r\n\tdevelopment time available.\r\n\r\n* DEPENDENCIES :\r\n\r\n\tRawFiles\Reference\CC\Reserving Triangle Delivery.xls\r\n\r\n\tMacro variables from SplitTransByReservingClass:\r\n\r\n\t&&SplitData&n..\r\n\t&SplitDataCount.\r\n\r\n* OUTPUTS"}]}

As you can see, this is a JSON array, which I've verified via http://jsonviewer.stack.hu/

I can load this JSON file into MongoDB, but obviously there is an issue with Elasticsearch and the characters in the content.

How can I modify the content to be accepted into Elasticsearch without altering the content too drastically?

Cheers

guilherme_maranhao · November 30, 2016, 10:25am

Hi Aussie,

We've faced a similar issue in our indexing process. What we did was remove all the special characters with a gsub call (our interface language to Elasticsearch is Ruby):

content = content.gsub(/[\“\”\"\'\\\']/m, ' ').gsub(/[\n\t\r]/m, ' ').gsub(/\s+/m, ' ').strip

Hope it works for you!

Guilherme
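
If stripping characters alters the content more than is acceptable, one alternative is to let a JSON library do the escaping instead of removing anything. The sketch below is not the method used in either post above. It assumes the raw SAS text is available as a Ruby string, and `bulk_lines` is a hypothetical helper name. It relies only on `JSON.generate` from Ruby's standard library, which escapes backslashes, double quotes, and control characters so that each document stays on one physical line, as the bulk format requires.

```ruby
require 'json'

# Hypothetical helper: builds the two newline-delimited JSON lines for one
# document (the action/metadata line, then the source line).
def bulk_lines(index, type, id, source)
  action = { 'index' => { '_index' => index, '_type' => type, '_id' => id } }
  # JSON.generate escapes backslashes, double quotes and control characters
  # (tab, CR, LF), so no raw line break ends up inside a document.
  [JSON.generate(action), JSON.generate(source)].join("\n") + "\n"
end

# Shortened sample of raw SAS text with real tabs, CRLF line endings
# and Windows-style backslashes (an assumption about the input).
sas_code = "* DEPENDENCIES :\r\n\r\n\tRawFiles\\Reference\\CC\\Reserving Triangle Delivery.xls\r\n\t&&SplitData&n..\r\n"

payload = bulk_lines('sascode_idx', 'content', '1',
                     { 'BuildAllTriangles' => [{ 'content' => sas_code }] })
```

Parsing the second line of `payload` back with `JSON.parse` returns the original text unchanged, so nothing is lost on the way into the index.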

system · December 28, 2016, 10:25am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
