I'm using BulkProcessor. Something is causing a problem:
[elasticsearch[moo][transport_client_worker][T#19]{New I/O worker #84}] ERROR - Failed to index record: MapperParsingException[failed to parse [_source]]; nested: ElasticsearchParseException[Failed to parse content to map]; nested: JsonParseException[Unexpected character (':' (code 58)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name#012 at [Source: [B@541daa55; line: 1, column: 1555]];
I read this as a JSON parse error in the bulk request. Is that right? How do I turn this into information I can use? It doesn't tell me which index the failure happened in, and it doesn't tell me what the message looked like.
public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
    if (response.hasFailures()) {
        for (BulkItemResponse item : response.getItems()) {
            if (item.isFailed()) {
                logger.error("Failed to index record: " + item.getFailureMessage());
            }
        }
    }
}
Can I get any context about the message that caused the failure from BulkItemResponse, BulkResponse, or BulkRequest? I tried pulling the "payloads" out of the BulkItemResponse, but they didn't seem to correspond to any kind of message body, so I can't identify where the malformed message is.
if (false == response.hasFailures()) {
    return;
}
for (int i = 0; i < response.getItems().size(); i++) {
    if (false == response.getItems().get(i).isFailed()) {
        continue;
    }
    logger.error("Failed to index [" + request.requests().get(i) + "]: [" + response.getItems().get(i).getFailureMessage() + "]");
}
Warning: I wrote this inside a little text box on a web page and didn't run it. It is almost certainly wrong. My only goal was to make it obvious that the requests and responses are kept in the same order.
It's almost certainly OK to skip the first check - it just iterates over the items looking for one with a failure, so it doesn't save any time and it makes the code longer.
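For reference, here's a rough sketch of how a listener like this gets wired into a BulkProcessor against the 2.x transport Client. The client and logger variables and the flush thresholds are placeholders, not anything from your setup:

import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.common.unit.TimeValue;

BulkProcessor bulkProcessor = BulkProcessor.builder(client, new BulkProcessor.Listener() {
    @Override
    public void beforeBulk(long executionId, BulkRequest request) {
        // nothing to do before the bulk goes out
    }

    @Override
    public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
        // per-item failure logging goes here, as in the snippet above
    }

    @Override
    public void afterBulk(long executionId, BulkRequest request, Throwable failure) {
        // the whole bulk request failed, e.g. the node went away
        logger.error("Bulk execution " + executionId + " failed", failure);
    }
})
        .setBulkActions(1000)                            // flush after 1000 queued requests
        .setFlushInterval(TimeValue.timeValueSeconds(5)) // or after 5 seconds, whichever comes first
        .build();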
Thanks! I am hoping this will be very helpful; I got significantly more feedback this way. There were some minor Array vs. ArrayList-isms to fix to get it to work for me, but otherwise not bad for pseudocode.
What ultimately worked for me:
@Override
public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
    if (response.hasFailures()) {
        // responses come back in the same order as the requests, so index i lines up
        for (int i = 0; i < response.getItems().length; i++) {
            BulkItemResponse item = response.getItems()[i];
            if (item.isFailed()) {
                IndexRequest ireq = (IndexRequest) request.requests().get(i);
                logger.error("Failed while indexing to " + item.getIndex() + " type " + item.getType() + " " +
                        "request: [" + ireq + "]: [" + item.getFailureMessage() + "]");
            }
        }
    }
}
I now get:
[elasticsearch[Dyna-Mite][transport_client_worker][T#5]{New I/O worker #70}] ERROR - Failed while indexing to data-2016.02.26 type datatype request: [index {[data-2016.02.26][datatype][null], source[{"json_obj"}]: [MapperParsingException[failed to parse [_source]]; nested: ElasticsearchParseException[Failed to parse content to map]; nested: JsonParseException[Unexpected character (':' (code 58)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name#012 at [Source: [B@5f09799a; line: 1, column: 1779]]; ]
I am hopeful that this will be a great help in troubleshooting this problem. Thanks.
Just wanted to say thanks again. This was a real help for me and allowed me to move forward in my work rather than being frustrated and wondering what was going wrong.
I am a bit frustrated that the Client interface mixes lists and arrays arbitrarily. We'll get a real Java client soon-ish, without all of Elasticsearch core's dependencies, and I'll try to do some of the code reviews for it so I can make sure it is consistent about which one it uses.
I believe their goal is to have a consistent API between all their clients.
My guess is JSON.
The existing binary API simply throws Map<String, Object> over the wire, and no one really has written a good ORM for Elasticsearch. Everyone is using Jackson databinding or GSON. Since the client will be standalone, I wonder if ES will continue using Jackson.
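As a rough illustration of that pattern (not ES internals, just a sketch assuming the 2.x IndexRequest API; myEvent is a made-up POJO, and the index/type names are copied from the log above):

import com.fasterxml.jackson.databind.ObjectMapper;
import org.elasticsearch.action.index.IndexRequest;
import java.util.HashMap;
import java.util.Map;

// option 1: hand ES a Map<String, Object> and let it serialize the source
Map<String, Object> doc = new HashMap<>();
doc.put("field", "value");
bulkProcessor.add(new IndexRequest("data-2016.02.26", "datatype").source(doc));

// option 2: databind a POJO to a JSON string yourself with Jackson
ObjectMapper mapper = new ObjectMapper();
String json = mapper.writeValueAsString(myEvent); // throws JsonProcessingException; handle it where appropriate
bulkProcessor.add(new IndexRequest("data-2016.02.26", "datatype").source(json));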