Way to re-index failed documents using BulkProcessor

We are developing an application that ingests and bulk-indexes time-series data using the native Java client.
Every document/event is important to us; we can't afford to miss a single one.

We are using BulkProcessor to index the data. If indexing fails because of malformed JSON, bulk queue unavailability, or any other reason, is there a way to track the failed documents/events from the BulkResponse?

I tried iterating through the BulkResponse in the afterBulk() method, but couldn't find the actual document/event.

Our plan is to index all such failed documents/events into a separate index (for example, unprocessed) that doesn't enforce any mappings.

Please help me identify the failed documents.

In the afterBulk method, you should be able to check for BulkResponse.hasFailures(). In case it returns true, you could iterate over response items and index failed ones into your unprocessed index.

I tried your proposal already, but couldn't find a way to get the actual document (in this case, the failed document) from BulkItemResponse. That object only carries the id, index and type details, not the document itself.

Am I missing anything important here?

Oh I see. Something useful is that the item at index i in the bulk response maps to the request at index i in the bulk request, so you can get a reference to the ActionRequest that failed, then cast it to an IndexRequest (if you know it is an IndexRequest) and get the source using the .source() method.
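
To make the index alignment concrete, here is a minimal sketch in plain Java. It does not use the Elasticsearch classes: Request and ItemResponse are hypothetical stand-ins for IndexRequest and BulkItemResponse, and failedSources is an illustrative helper name. In the real client, the corresponding calls would be BulkRequest.requests(), BulkResponse.getItems(), BulkItemResponse.isFailed() and IndexRequest.source(); exact return types differ between client versions, so treat this as an outline rather than drop-in code.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types, NOT the Elasticsearch classes: they model only the parts
// of IndexRequest and BulkItemResponse that this technique relies on.
class FailedBulkItems {
    // Plays the role of IndexRequest (id plus the JSON source).
    record Request(String id, String source) {}

    // Plays the role of BulkItemResponse (no source, only status details).
    record ItemResponse(String id, boolean failed, String failureMessage) {}

    // Response i belongs to request i, so the failed documents can be
    // recovered by walking both lists in step.
    static List<String> failedSources(List<Request> requests,
                                      List<ItemResponse> responses) {
        List<String> failed = new ArrayList<>();
        for (int i = 0; i < responses.size(); i++) {
            if (responses.get(i).failed()) {
                failed.add(requests.get(i).source());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<Request> requests = List.of(
            new Request("1", "{\"ts\":\"2016-01-01\"}"),
            new Request("2", "{\"ts\":\"not-a-date\"}"),
            new Request("3", "{\"ts\":\"2016-01-03\"}"));
        List<ItemResponse> responses = List.of(
            new ItemResponse("1", false, null),
            new ItemResponse("2", true, "hypothetical parse failure"),
            new ItemResponse("3", false, null));

        // Each source collected here would be sent to the "unprocessed" index.
        for (String source : failedSources(requests, responses)) {
            System.out.println(source);
        }
    }
}
```

Inside afterBulk, the same loop would run over the request and response passed to the callback, with each recovered source indexed into the unprocessed index.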


Thank you very much. With your suggestion I was finally able to get the source document.
