Bulk API possible bug


(Pablo Musa) #1

Hi guys,

Today I was using the bulk API and the data was loading just fine into
Elasticsearch.
However, when querying Elasticsearch, the resulting JSON looked OK at first
glance but was invalid, with an extra comma after _source:

"hits": [{
...
"_source":{...},
},
{
...
"_source":{...},
},
}]

After a long time, I found out that the data line in my bulk file had a ','
at the end, like this:
{"index":{"_index":"XXX","_id":222,"_type":"YYY"}}
{"id":222, "test": name},

I am not sure whether that is the expected behavior, but it took me a long
time to find out :(
First, the bulk request raised no error. Second, Elasticsearch raised no
error at query time (although it returned invalid JSON containing the data).
Finally, Kibana showed no error when trying to use the index.
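
Since nothing complained at bulk time, one workaround is to check the file before sending it. Below is a minimal sketch, assuming a newline-delimited bulk file in which every non-blank line must be exactly one JSON object. The helper name find_bad_bulk_lines is hypothetical, and the sample quotes "name" so that the trailing comma is the only problem on the line.

```python
import json


def find_bad_bulk_lines(text):
    """Return (line_number, message) for each line that is not exactly one JSON object."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # this sketch simply skips blank lines
        try:
            # json.loads rejects anything left over after the first value,
            # so a stray trailing comma is reported as an error here.
            value = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, err.msg))
            continue
        if not isinstance(value, dict):
            problems.append((number, "not a JSON object"))
    return problems


# Sample mirroring the two lines from the post (values are illustrative).
sample = (
    '{"index":{"_index":"XXX","_id":222,"_type":"YYY"}}\n'
    '{"id":222, "test": "name"},\n'
)

for number, message in find_bad_bulk_lines(sample):
    print(f"line {number}: {message}")
```

Running a check like this over the file before the bulk request would flag the second line, which is the one with the trailing comma.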

Thanks,
Pablo

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/0d73335f-7c02-4367-bae3-a4077b12daec%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Brian Yoder) #2

Hi, Pablo.

I remember reading that Elasticsearch will happily store an invalid JSON
string as your _source.

From my usage of the Java API, I noticed that the Jackson library is used,
but that only the stream parser is present. What this tells me is that ES
is likely parsing your JSON token-by-token and has processed and indexed
most of it. In other words, an error isn't an all-or-nothing situation.
Since your syntax error happens at the very end of the document,
Elasticsearch has indexed all of the document before it encounters the
error.

My guess is that if the error was not at the very end of the document, then
Elasticsearch would fail to process and index any information past the
error, but would successfully process and index information (if any) before
the error.
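
To illustrate the idea, here is a small sketch using Python's standard json module rather than Jackson. It is only an analogy and not Elasticsearch's actual code path: a parser that reads a single value from the front of the input never sees the trailing comma, and if the raw text is then stored and pasted verbatim into a response, the response becomes invalid. The response shape below is simplified and hypothetical.

```python
import json

# Data line with the stray trailing comma (values are illustrative).
raw_source = '{"id":222, "test": "name"},'

# raw_decode parses one JSON value from the start of the string and reports
# where it stopped; it does not inspect whatever follows that value.
document, end = json.JSONDecoder().raw_decode(raw_source)
leftover = raw_source[end:]
print(document)        # the fields were parsed without any error
print(repr(leftover))  # the comma that the parser never looked at

# Assumption for this sketch: the raw text is stored as-is and spliced
# verbatim into a (simplified) search response.
response = '{"hits":[{"_id":"222","_source":' + raw_source + '}]}'
try:
    json.loads(response)
    response_is_valid = True
except json.JSONDecodeError:
    response_is_valid = False
print(response_is_valid)
```

The first step succeeds and leaves only the comma unread, while the spliced response fails strict parsing, which matches the symptom described in the first post.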

Brian



(Pablo Musa) #3

Thanks for the answer, Brian!

Regards,
Pablo

(In reply to Brian's message of 2014-06-23 16:24 GMT-03:00.)



