UpdateByQuery API Response returns wrong status

I am facing an issue with the UpdateByQuery API while trying to update a document that doesn’t exist in Elasticsearch.

Problem description

Step 1 - We create one index per day, such as test_index-2020.03.11, test_index-2020.03.12…, and we keep eight daily indexes (today’s plus the previous seven days).
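For illustration, the eight index names can be derived from the current date. This is only a sketch: the class and method names are made up for this example, and the date suffix pattern is inferred from the sample names above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

public class DailyIndexNames {
    // Suffix pattern inferred from the example names (test_index-2020.03.11).
    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("yyyy.MM.dd");

    /** Returns the index names for today and the previous daysBack days, newest first. */
    public static List<String> retainedIndexes(String prefix, LocalDate today, int daysBack) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i <= daysBack; i++) {
            names.add(prefix + "-" + today.minusDays(i).format(SUFFIX));
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints eight names, starting with the current day's index.
        retainedIndexes("test_index", LocalDate.now(), 7).forEach(System.out::println);
    }
}
```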

Step 2 - When data arrives (read one by one or in bulk from a Kafka topic), we need to either update the document if one with the given ID already exists (it may be in any one of the eight daily indexes) or save it to the current day’s index if it does not exist.

Solution I am currently trying when data arrives one by one:

-> Use UpdateByQuery with an inline script to update the doc

-> If BulkByScrollResponse returns an updated count of 0, save the doc
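The two steps above can be sketched as follows. This is not the real client code: DocStore is a hypothetical stand-in for the UpdateByQuery and index calls, and InMemoryStore exists only so the sketch runs without a cluster.

```java
import java.util.HashMap;
import java.util.Map;

public class UpdateOrSave {
    /** Hypothetical stand-in for the Elasticsearch calls; not a real client API. */
    public interface DocStore {
        /** Would run UpdateByQuery for the ID across the daily indexes and return the updated count. */
        long updateByQuery(String docId, String docJson);

        /** Would index the document into the given index. */
        void save(String index, String docId, String docJson);
    }

    /** Returns true if an existing doc was updated, false if the doc was saved as new. */
    public static boolean updateOrSave(DocStore store, String todayIndex, String docId, String docJson) {
        long updated = store.updateByQuery(docId, docJson);
        if (updated > 0) {
            return true;
        }
        // Nothing matched the ID, so fall back to saving into the current day's index.
        store.save(todayIndex, docId, docJson);
        return false;
    }

    /** In-memory fake used only to exercise the flow. */
    public static class InMemoryStore implements DocStore {
        public final Map<String, String> docs = new HashMap<>();

        @Override
        public long updateByQuery(String docId, String docJson) {
            if (!docs.containsKey(docId)) {
                return 0;
            }
            docs.put(docId, docJson);
            return 1;
        }

        @Override
        public void save(String index, String docId, String docJson) {
            docs.put(docId, docJson);
        }
    }
}
```

The fallback is only as reliable as the updated count, so the query inside updateByQuery has to match the given ID exactly and nothing else.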

Issues:

Even if the doc doesn’t exist, BulkByScrollResponse still returns a non-zero updated count (1, 2, 3, 4…), as follows:

BulkIndexByScrollResponse[sliceId=null,updated=1,created=0,deleted=0,batches=1,versionConflicts=0,noops=0,retries=0,throttledUntil=0s]

Because of this, the document save request is never triggered.

How should I approach the case where a bulk of documents (each with a different doc ID) needs to be updated with their respective content in a single request? Can this be achieved with UpdateByQuery?
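For comparison, one way to send many per-ID updates in a single request is the Bulk API with update actions and doc_as_upsert. The sketch below only builds the newline-delimited request body. The class and method names are made up, and it rests on an assumption that does not hold for Step 2 as described: each update action names a single index, so the index holding each ID must already be known.

```java
import java.util.Map;

public class BulkUpsertBody {
    /**
     * Builds an NDJSON body for the _bulk endpoint: one update action per doc,
     * with doc_as_upsert so a missing doc is created instead of failing.
     * Assumes the index name and IDs need no JSON escaping and that each
     * value in docsById is already a serialized JSON object.
     */
    public static String build(String index, Map<String, String> docsById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : docsById.entrySet()) {
            // Action line: which index and ID this update targets.
            body.append("{\"update\":{\"_index\":\"").append(index)
                .append("\",\"_id\":\"").append(e.getKey()).append("\"}}\n");
            // Source line: partial doc, inserted as-is when the ID is not found.
            body.append("{\"doc\":").append(e.getValue()).append(",\"doc_as_upsert\":true}\n");
        }
        return body.toString();
    }
}
```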

Note: Considering the amount of data to be processed per hour, we need to avoid multiple hits to Elasticsearch.

Please suggest a solution for the issues mentioned above.
