_update_by_query for non-refreshed documents

I have an index called products. I need to update categoryDisplayName for all products with a given categoryName. Whenever my application receives an event to update a category name, it saves the new value and fills it into newly created product documents, so new products always have an up-to-date category name. Such an event should also trigger the same update for products that already exist in the Elasticsearch index.

For this purpose I've used update by query. However, I struggle to keep my data consistent: there is a time window in which some recently created documents have not been refreshed yet, so my _update_by_query request will not match them.

The problem is that even though my application has received the "Change Category Name" event and has already saved a product, a product that happens to fall into this unlucky time window will simply keep the wrong data; its category will never be updated.

How should I deal with this? Should I refresh right before doing the _update_by_query? That still leaves a (much smaller) time window between the _refresh request and the _update_by_query request. Or is there another way to deal with such inconsistencies?

Welcome to our community! :smiley:

Refreshing right before the _update_by_query is probably the best option.
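
For illustration, here is a minimal sketch of the refresh-then-update sequence, assuming the official Python client (an `Elasticsearch` instance passed in as `es`). The function name `rename_category` is hypothetical, the field names come from the question, and the `term` query assumes categoryName is mapped as a keyword field. Depending on the client version, the request may need to be passed as `query=` and `script=` keyword arguments instead of `body=`.

```python
def rename_category(es, category_name, new_display_name, index="products"):
    """Refresh the index, then update every product in the given category.

    `es` is assumed to be an elasticsearch.Elasticsearch client (or any
    object exposing the same `indices.refresh` and `update_by_query` calls).
    """
    # Make recently indexed documents visible to search before querying.
    es.indices.refresh(index=index)

    body = {
        "script": {
            "lang": "painless",
            "source": "ctx._source.categoryDisplayName = params.name",
            "params": {"name": new_display_name},
        },
        # Assumption: categoryName is a keyword field, so `term` matches exactly.
        "query": {"term": {"categoryName": category_name}},
    }

    # conflicts="proceed" keeps going if a document changed in the meantime;
    # refresh=True refreshes the affected shards once the update has finished.
    return es.update_by_query(
        index=index, body=body, conflicts="proceed", refresh=True
    )
```

This narrows the window but does not close it: a product indexed between the two requests can still be missed, as the question already points out.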
