Two successive UpdateByQuery requests: how to refresh after the first one

Hi all,

I run two update-by-query requests one after the other, each like this one:

UpdateByQueryRequestBuilder ubqrb = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
ubqrb.source(index).source().setTypes(itemType);
BulkIndexByScrollResponse response = ubqrb.setSlices(2).setRequestsPerSecond(100000F)
        .setMaxRetries(1000).abortOnVersionConflict(false).script(actualScript)
        .filter(query)
        .refresh(true).get();

The second one's response contains version conflicts. If I wait a few seconds between the two, I get a consistent result with no version conflicts. Is there a better way to do this, or a refresh that makes the updates of the first query visible to the second?

Thanks for your help.

.refresh(true) on the first one should make it visible to the second one.

A less intrusive option is to use refresh(RefreshPolicy.WAIT_FOR) but that isn't supported by update by query. I just never got around to supporting it, mostly because I wasn't really sure of the "right" way to do it.

BTW, .setRequestsPerSecond(100000F) is pretty much the same as not setting it at all.
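
To put a rough number on that, here is a minimal sketch of the throttling arithmetic. It assumes the documented behaviour that update by query pads each scroll batch to a target time of batch size divided by requests per second, and it assumes a batch size of 1000 documents. The class and method names are made up for illustration and are not part of the Elasticsearch API.

```java
import java.util.Locale;

public class ThrottleEstimate {

    /**
     * Target duration of one scroll batch in seconds, before the time
     * spent on the bulk write itself is subtracted.
     * Assumption: target = batchSize / requestsPerSecond.
     */
    public static double targetSecondsPerBatch(int batchSize, float requestsPerSecond) {
        return batchSize / (double) requestsPerSecond;
    }

    public static void main(String[] args) {
        int batchSize = 1000; // assumed scroll batch size

        // The value from the question: the target is tiny, so there is
        // effectively nothing left to wait for after the write completes.
        System.out.println(String.format(Locale.ROOT, "100000 rps: %.3f s per batch",
                targetSecondsPerBatch(batchSize, 100000F)));

        // A value low enough to actually slow the request down.
        System.out.println(String.format(Locale.ROOT, "500 rps: %.3f s per batch",
                targetSecondsPerBatch(batchSize, 500F)));
    }
}
```

Under those assumptions the target is 0.01 seconds per batch at 100000 requests per second, versus 2 seconds per batch at 500, so any bulk write that takes longer than 10 ms leaves no wait time at all.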

Hi Nik,
Thanks for your answer. The problem is that even with .refresh(true) I can't find a way to make the second query wait until the first one has finished its refresh; in my case the first query is much heavier than the second one. I switched from partial updates with a script to update by query because I couldn't find a way to solve this. So I really need the second query to wait until the first one's refresh has finished in Elasticsearch. Is it possible to check whether my index has a refresh in progress, and wait while that is true before sending the second one?

Thanks a lot for the reply.

When you set refresh(true), Elasticsearch will only respond after the refresh is done.
If the second query is in the same thread after the first one, it should work.

Are you doing that from 2 different threads?

The same thread:

for (int i = 0; i < scripts.length; i++) {
    Script actualScript = new Script(ScriptType.INLINE, "painless", scripts[i], scriptParams[i]);

    UpdateByQueryRequestBuilder ubqrb = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
    ubqrb.source(index).source().setTypes(itemType);
    BulkIndexByScrollResponse response = ubqrb.setSlices(2)
            .setMaxRetries(1000).abortOnVersionConflict(false).script(actualScript)
            .filter(query)
            .refresh(true).get();
    if (response.getBulkFailures().size() > 0) {
        for (BulkItemResponse.Failure failure : response.getBulkFailures()) {
            logger.error("Failure : cause={} , message={}", failure.getCause(), failure.getMessage());
        }
    }
    if (response.isTimedOut()) {
        logger.error("Update By Query ended with timeout!");
    }
    if (response.getVersionConflicts() > 0) {
        logger.warn("Update By Query ended with {} Version Conflicts!", response.getVersionConflicts());
    }
    if (response.getNoops() > 0) {
        logger.warn("Update By Query ended with {} noops!", response.getNoops());
    }
}

I get the version-conflict warning each time the first request takes longer than the second one. If I set a breakpoint and wait a little, I get no conflicts.

Try adding something like this at the beginning of your loop:

client.admin().indices().prepareRefresh(index).get();

And then remove .refresh(true).
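
For anyone wondering why a refresh between the two requests matters, here is a toy model in plain Java. It is not Elasticsearch code: the class, its fields and its methods are all invented for illustration. It only assumes that a search sees the index as of the last refresh, and that update by query rejects a document whose current version no longer matches the version its search returned.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of refresh visibility. Hypothetical names, not the Elasticsearch API. */
public class StaleViewDemo {

    /** Current document versions, as the write path sees them. */
    private final Map<String, Long> liveVersions = new HashMap<>();

    /** Versions visible to search; only replaced when refresh() is called. */
    private Map<String, Long> searchableVersions = new HashMap<>();

    /** Creates the document at version 1, or bumps its version if it exists. */
    public void index(String id) {
        liveVersions.merge(id, 1L, Long::sum);
    }

    /** Makes all writes so far visible to later searches. */
    public void refresh() {
        searchableVersions = new HashMap<>(liveVersions);
    }

    /** Updates every searchable document and returns the number of version conflicts. */
    public int updateByQuery() {
        int conflicts = 0;
        // The request works from the search view as of its start.
        Map<String, Long> snapshot = new HashMap<>(searchableVersions);
        for (Map.Entry<String, Long> hit : snapshot.entrySet()) {
            Long live = liveVersions.get(hit.getKey());
            if (hit.getValue().equals(live)) {
                liveVersions.put(hit.getKey(), live + 1); // versions match: write succeeds
            } else {
                conflicts++; // document changed since the search view was built
            }
        }
        return conflicts;
    }

    /** Runs two updates back to back and reports the conflicts of the second one. */
    public static int conflictsOnSecondRun(boolean refreshBetween) {
        StaleViewDemo idx = new StaleViewDemo();
        idx.index("a");
        idx.index("b");
        idx.refresh();
        idx.updateByQuery(); // first request bumps both documents
        if (refreshBetween) {
            idx.refresh(); // plays the role of the explicit refresh at the top of the loop
        }
        return idx.updateByQuery();
    }

    public static void main(String[] args) {
        System.out.println("without refresh: " + conflictsOnSecondRun(false) + " conflicts");
        System.out.println("with refresh: " + conflictsOnSecondRun(true) + " conflicts");
    }
}
```

In this model the second request conflicts on both documents when it runs against the stale view and on none after the refresh, which matches the symptom described above. It says nothing about why .refresh(true) alone was not enough in the real cluster.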

Thanks a lot
