How to identify message causing error in bulk request


#1

I'm using BulkProcessor. Something is causing a problem:

[elasticsearch[moo][transport_client_worker][T#19]{New I/O worker #84}] ERROR - Failed to index record: MapperParsingException[failed to parse [_source]]; nested: ElasticsearchParseException[Failed to parse content to map]; nested: JsonParseException[Unexpected character (':' (code 58)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name#012 at [Source: [B@541daa55; line: 1, column: 1555]];

I read this as a JSON parse error in the bulk request. Is that right? How do I turn this into information I can use? It doesn't tell me which index the failure happened in, and it doesn't tell me what the message looked like.

    public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
        if (response.hasFailures()) {
            for (BulkItemResponse item : response.getItems()) {
                if (item.isFailed()) {
                    logger.error("Failed to index record: " + item.getFailureMessage());
                }
            }
        }
    }

Can I get any context about the message that causes the failure from BulkItemResponse, BulkResponse, or BulkRequest? I tried pulling the "payloads" out of the BulkItemResponse, but they didn't seem to correspond to any kind of message body, so I can't identify where the malformed message is.

Thanks for any help that can be offered.


(Nik Everett) #2

Something like this ought to do:

if (false == response.hasFailures()) {
  return;
}
for (int i = 0; i < response.getItems().size(); i++) {
  if (false == response.getItems().get(i).isFailed()) {
    continue;
  }
  logger.error("Failed to index [" + request.requests().get(i) + "]: [" + response.getItems().get(i).getFailureMessage() + "]");
}

Warning: I wrote this inside a little text box on a web page and didn't run it. It is almost certainly wrong. My only goal was to make it obvious that the requests and responses are kept in the same order.


(Nik Everett) #3

It's almost certainly OK to skip the first hasFailures check - it just iterates over the items looking for one with a failure, so the check doesn't save any time and it makes the code longer.
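The same-order guarantee is the whole trick, so here's a dependency-free toy illustrating it (Item and firstFailure are stand-ins I made up for this sketch, not the real Elasticsearch classes):

```java
import java.util.List;

public class BulkCorrelation {
    // Stand-in for BulkItemResponse: did this item fail, and why?
    public record Item(boolean failed, String failureMessage) {}

    // Requests and responses are parallel lists: the item at position i
    // describes the outcome of the request at position i.
    public static String firstFailure(List<String> requests, List<Item> items) {
        for (int i = 0; i < items.size(); i++) {
            if (items.get(i).failed()) {
                return "Failed to index [" + requests.get(i) + "]: ["
                        + items.get(i).failureMessage() + "]";
            }
        }
        return null; // no failures in this bulk
    }

    public static void main(String[] args) {
        List<String> requests = List.of("{\"ok\":1}", "{bad json}");
        List<Item> items = List.of(new Item(false, null),
                                   new Item(true, "JsonParseException"));
        System.out.println(firstFailure(requests, items));
        // → Failed to index [{bad json}]: [JsonParseException]
    }
}
```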


#4

Thanks, I will give this a try and get back to you.


#5

Thanks! I am hoping this will be very helpful, I got significantly more feedback this way. There were some minor Array vs. ArrayList-isms to get it to work for me, but otherwise not bad for pseudocode.

What ultimately worked for me:

    @Override
    public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
        if (response.hasFailures()) {
            for (int i = 0; i < response.getItems().length; i++) {
                BulkItemResponse item = response.getItems()[i];
                if (item.isFailed()) {
                    IndexRequest ireq = (IndexRequest) request.requests().get(i);
                    logger.error("Failed while indexing to " + item.getIndex() + " type " + item.getType() + " " +
                                 "request: [" + ireq + "]: [" + item.getFailureMessage() + "]");
                }
            }
        }
    }

I now get:

[elasticsearch[Dyna-Mite][transport_client_worker][T#5]{New I/O worker #70}] ERROR - Failed while indexing to data-2016.02.26 type datatype request: [index {[data-2016.02.26][datatype][null], source[{"json_obj"}]: [MapperParsingException[failed to parse [_source]]; nested: ElasticsearchParseException[Failed to parse content to map]; nested: JsonParseException[Unexpected character (':' (code 58)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name#012 at [Source: [B@5f09799a; line: 1, column: 1779]]; ]

I am hopeful that this will be a great help in troubleshooting this problem, thanks.


#6

Just wanted to say thanks again. This was a real help for me and allowed me to move forward in my work rather than being frustrated and wondering what was going wrong.


(Nik Everett) #7

Sure! I'm glad I could help!

I am a bit frustrated that the Client interface mixes lists and arrays arbitrarily. We'll get a real Java client soon-ish, without all the dependencies of Elasticsearch's core, and I'll try to do some of the code reviews for it so I can make sure it's consistent about which one it uses.


(Ivan Brusic) #8

As if working with JSON will make it any easier! :slight_smile:


(Jörg Prante) #9

I am also curious whether the new Java HTTP client will just throw JSON over the fence, or whether it will parse JSON into a new, improved ES API.


(Ivan Brusic) #10

I believe their goal is to have a consistent API between all their clients.
My guess is JSON.

The existing binary API simply throws Map<String, Object> over the wire, and no one has really written a good ORM for Elasticsearch. Everyone is using Jackson databinding or GSON. Since the client will be standalone, I wonder if ES will continue using Jackson.

Ivan

