I am facing an issue with the UpdateByQuery API when trying to update a document that doesn’t exist in Elasticsearch
Problem description
Step 1 - We create one index per day, like test_index-2020.03.11, test_index-2020.03.12…, and we keep eight daily indices (today’s plus the previous seven days).
Step 2 - When data arrives (read one at a time or in bulk from a Kafka topic), we need to update the document if one with the given ID already exists (it may be in any of the eight daily indices), or save it to the current day’s index if it does not exist.
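To make the index layout in Step 1 concrete, here is a minimal sketch of how the eight index names could be derived. It uses only the JDK; the class and method names (DailyIndexNames, indexFor, liveIndices) are made up for illustration, and the date suffix format is assumed from the example names above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any Elasticsearch client.
final class DailyIndexNames {
    // Assumed suffix format, taken from names such as test_index-2020.03.11.
    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("yyyy.MM.dd");

    /** Name of the index for a single day, e.g. test_index-2020.03.11. */
    static String indexFor(String prefix, LocalDate day) {
        return prefix + "-" + day.format(SUFFIX);
    }

    /** The eight live indices, newest first: today plus the previous seven days. */
    static List<String> liveIndices(String prefix, LocalDate today) {
        List<String> names = new ArrayList<>();
        for (int daysBack = 0; daysBack <= 7; daysBack++) {
            names.add(indexFor(prefix, today.minusDays(daysBack)));
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints one index name per line for a fixed example date.
        liveIndices("test_index", LocalDate.of(2020, 3, 12)).forEach(System.out::println);
    }
}
```

The first entry of the list is the current day's index (where new documents are saved); the full list, or a wildcard such as test_index-*, is what an update would have to search.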
Solution I am currently trying when data arrives one at a time:
-> Use UpdateByQuery with an inline script to update the doc
-> If BulkByScrollResponse returns an updated count of 0, save the doc
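The control flow of the two steps above can be sketched as follows. This is only an illustration of the intended logic, not the actual client code: DocStore, updateById, save and InMemoryStore are hypothetical stand-ins for the UpdateByQuery and index requests, and the sketch assumes the update matches documents by exact ID only.

```java
import java.util.HashMap;
import java.util.Map;

final class UpdateOrSaveSketch {
    /** Hypothetical stand-in for the two Elasticsearch calls; not a real client API. */
    interface DocStore {
        /** Updates every document with this exact ID in the matching indices; returns the updated count. */
        long updateById(String indexPattern, String id, String content);

        /** Saves a new document into one concrete index. */
        void save(String index, String id, String content);
    }

    /** Returns true if an existing document was updated, false if a new one was saved. */
    static boolean updateOrSave(DocStore store, String indexPattern, String todayIndex,
                                String id, String content) {
        long updated = store.updateById(indexPattern, id, content);
        if (updated > 0) {
            return true;
        }
        // Nothing was updated, so the document is new: save it to today's index.
        store.save(todayIndex, id, content);
        return false;
    }

    /** In-memory DocStore for demonstration only; keys are "index/id". */
    static final class InMemoryStore implements DocStore {
        final Map<String, String> docs = new HashMap<>();

        @Override
        public long updateById(String indexPattern, String id, String content) {
            long updated = 0;
            // The index pattern is ignored here: every stored index is searched.
            for (Map.Entry<String, String> entry : docs.entrySet()) {
                if (entry.getKey().endsWith("/" + id)) {
                    entry.setValue(content);
                    updated++;
                }
            }
            return updated;
        }

        @Override
        public void save(String index, String id, String content) {
            docs.put(index + "/" + id, content);
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        // First call finds nothing to update and saves; second call updates in place.
        System.out.println(updateOrSave(store, "test_index-*", "test_index-2020.03.12", "42", "v1"));
        System.out.println(updateOrSave(store, "test_index-*", "test_index-2020.03.12", "42", "v2"));
    }
}
```

The flow only works if the updated count is 0 whenever no document with the given ID exists, which is exactly the assumption that does not hold in the behaviour described under Issues.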
Issues:
Even when the doc doesn’t exist, BulkByScrollResponse still returns a non-zero updated count (1, 2, 3, 4…), as follows:
BulkIndexByScrollResponse[sliceId=null,updated=1,created=0,deleted=0,batches=1,versionConflicts=0,noops=0,retries=0,throttledUntil=0s]
Because of this, the document save request is never triggered.
How should I approach the case where a batch of documents (with different doc IDs) needs to be updated with their respective content in a single request? Can this be achieved with UpdateByQuery?
Note: Given the amount of data to be processed per hour, we need to avoid multiple requests to Elasticsearch.
Please suggest a solution for the issues mentioned above.