Way to re-index failed documents using BulkProcessor


(Phani Kumar Varma) #1

We are developing an application that ingests and bulk-indexes time-series data using the native Java client.
Every document/event is important to us; we can't afford to miss a single one.

We use BulkProcessor to index the data. If indexing fails because of malformed JSON, an unavailable bulk queue, or any other reason, is there a way to track the failed documents/events from the BulkResponse?

I tried iterating through the BulkResponse in the afterBulk() method, but couldn't find the actual document/event.

Our plan is to index all such failed documents/events into a separate index (e.g. unprocessed) that doesn't apply any mappings.

Please help me identify the failed documents.


(Adrien Grand) #2

In the afterBulk method, you should be able to check BulkResponse.hasFailures(). If it returns true, you could iterate over the response items and index the failed ones into your unprocessed index.


(Phani Kumar Varma) #3

I already tried your suggestion, but couldn't find a way to get the actual document (in this case, the failed document) from a BulkItemResponse. That object only has the id, index, and type details, not the actual document.

Am I missing anything important here?


(Adrien Grand) #4

Oh I see. Something useful is that the item at index i in the response corresponds to the request at index i in the bulk request, so you can get a reference to the ActionRequest that failed, cast it to an IndexRequest (if you know it is an IndexRequest), and get the source using the .source() method.
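
The positional mapping can be sketched roughly as follows. To keep the example runnable without the Elasticsearch jars, DocRequest and ItemResult are hypothetical stand-ins for IndexRequest and BulkItemResponse, and FailedDocumentRecovery and collectFailures are illustrative names, not part of the client. In a real afterBulk(long executionId, BulkRequest request, BulkResponse response) callback, the two lists would come from request.requests() and response.getItems(), the failure check would be BulkItemResponse.isFailed(), and the reason would be getFailureMessage().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FailedDocumentRecovery {

    // Hypothetical stand-in for IndexRequest: only the fields this sketch needs.
    public static final class DocRequest {
        public final String index;
        public final String id;
        public final String source;

        public DocRequest(String index, String id, String source) {
            this.index = index;
            this.id = id;
            this.source = source;
        }
    }

    // Hypothetical stand-in for BulkItemResponse: it carries no document body.
    public static final class ItemResult {
        public final boolean failed;
        public final String failureMessage;

        public ItemResult(boolean failed, String failureMessage) {
            this.failed = failed;
            this.failureMessage = failureMessage;
        }
    }

    // A failed request paired with the reason it was rejected.
    public static final class FailedDocument {
        public final DocRequest request;
        public final String reason;

        public FailedDocument(DocRequest request, String reason) {
            this.request = request;
            this.reason = reason;
        }
    }

    // Pairs each failed item with the request at the same position.
    // Relies on results.get(i) being the outcome of requests.get(i).
    public static List<FailedDocument> collectFailures(List<DocRequest> requests,
                                                       List<ItemResult> results) {
        if (requests.size() != results.size()) {
            throw new IllegalArgumentException("request and response item counts differ");
        }
        List<FailedDocument> failures = new ArrayList<>();
        for (int i = 0; i < results.size(); i++) {
            ItemResult result = results.get(i);
            if (result.failed) {
                failures.add(new FailedDocument(requests.get(i), result.failureMessage));
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<DocRequest> requests = Arrays.asList(
                new DocRequest("events", "1", "{\"value\":1}"),
                new DocRequest("events", "2", "{broken json"),
                new DocRequest("events", "3", "{\"value\":3}"));
        List<ItemResult> results = Arrays.asList(
                new ItemResult(false, null),
                new ItemResult(true, "failed to parse"),
                new ItemResult(false, null));

        // In the real callback, each failed source would be re-indexed into "unprocessed".
        for (FailedDocument doc : collectFailures(requests, results)) {
            System.out.println("unprocessed <- " + doc.request.source + " (" + doc.reason + ")");
        }
    }
}
```

With the real client, the element taken from request.requests() would still need the cast to IndexRequest before calling .source(), so an instanceof check is a sensible guard if the bulk can also contain delete or update requests.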


(Phani Kumar Varma) #5

Thank you very much. With your suggestion, I was finally able to get the source/document.

