Where is the exception in “json_parse_exception”?


(Rony Armon) #1

I'm using Kibana/ES 6.5 to index around 470 documents crawled from a Hebrew website with:
POST /index_name/_doc/_bulk

I can post and get a few individual documents, but some probably contain illegal characters, because when I post them all I get the following error message:

{
  "error": {
    "root_cause": [
      {
        "type": "json_parse_exception",
        "reason": "Unexpected character ('×' (code 215)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@138e10d; line: 1, column: 2]"
      }
    ],
    "type": "json_parse_exception",
    "reason": "Unexpected character ('×' (code 215)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@138e10d; line: 1, column: 2]"
  },
  "status": 500
}

Is there a way to find out which of the documents contain the characters that have led to this error?


(David Turner) #2

The responses from a bulk request come back in the same order as the request:

The response to a bulk action is a large JSON structure with the individual results of each action that was performed in the same order as the actions that appeared in the request. The failure of a single action does not affect the remaining actions.

Thus if this error appears in the 4th entry of the items array in the response then it's the 4th document that has a problem.
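
As a rough sketch of how that lookup could be automated, assuming the bulk response has already been parsed into a Python dict: the function name failed_items is made up for illustration, and the code only relies on the items array and the per-item error field of the bulk response.

```python
def failed_items(bulk_response):
    """Return (position, action, error) for each failed action in a bulk response.

    Positions are 1-based and follow the order of the actions in the request.
    """
    failures = []
    for position, item in enumerate(bulk_response.get("items", []), start=1):
        # Each item is a single-key dict such as {"index": {...}}.
        for action, result in item.items():
            if "error" in result:
                failures.append((position, action, result["error"]))
    return failures
```

If the response has no items array at all, this returns an empty list, which would suggest the request was rejected as a whole rather than per document.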


(David Turner) #3

Sorry, I just saw this. I think this indicates that the whole request was malformed, rather than any individual document. You must alternate lines like this:

action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n

Perhaps there is a newline embedded in one or more of your documents?
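
One way to check this without searching by hand is sketched below, assuming the documents are available as Python dicts. The names build_bulk_body and find_bad_lines are hypothetical helpers, not part of any Elasticsearch client. The first serializes each document onto a single line (json.dumps escapes embedded newlines), and the second reports which lines of an existing request body do not parse as JSON.

```python
import json


def build_bulk_body(docs):
    """Build a newline-delimited bulk body with one index action per document."""
    lines = []
    for doc in docs:
        # Action line; the index and type are assumed to come from the URL.
        lines.append(json.dumps({"index": {}}))
        # json.dumps writes a newline inside a string as the two characters \n,
        # so every document stays on a single line.
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


def find_bad_lines(body):
    """Return (line_number, message) for each non-empty line that is not valid JSON."""
    bad = []
    for number, line in enumerate(body.split("\n"), start=1):
        if not line.strip():
            continue
        try:
            json.loads(line)
        except ValueError as exc:
            bad.append((number, str(exc)))
    return bad
```

Running find_bad_lines over the exact text being posted should point at the offending lines. When sending the body, it is probably also worth making sure it is encoded as UTF-8.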


(Rony Armon) #4

Thanks. I've cleaned the documents of newlines, so the only option left is to search for the problem manually, which is what I wanted to avoid. Regarding your first answer, I don't see an items entry in the response, only a line/column reference.