The second one's response contains version conflicts. If I wait a few seconds between the two, I get a coherent result without version conflicts. Is there a better way to do this, or a refresh that makes the updates of the first query visible?
.refresh(true) on the first one should make its updates visible to the second one.
A less intrusive option is to use refresh(RefreshPolicy.WAIT_FOR) but that isn't supported by update by query. I just never got around to supporting it, mostly because I wasn't really sure of the "right" way to do it.
BTW, .setRequestsPerSecond(100000F) is pretty much the same as not setting it at all.
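To put rough numbers on that: as I understand the throttling described in the update by query docs, the wait between batches is the batch size divided by requests_per_second, minus the time spent writing. Here is a minimal sketch of that arithmetic, assuming the default batch size of 1000. The class and method names are made up for illustration and are not part of any Elasticsearch API.

```java
// Sketch of the throttling arithmetic only; no Elasticsearch classes involved.
// Assumption: batches of 1000 documents (the default scroll size).
final class ThrottleMath {

    /** Time budget for one batch in milliseconds: batchSize / requestsPerSecond. */
    static double targetMillis(int batchSize, float requestsPerSecond) {
        return batchSize * 1000.0 / requestsPerSecond;
    }

    /** Padding added before the next batch; never negative. */
    static double waitMillis(int batchSize, float requestsPerSecond, double writeMillis) {
        return Math.max(0.0, targetMillis(batchSize, requestsPerSecond) - writeMillis);
    }

    public static void main(String[] args) {
        // 1000 docs per batch at 100000 requests/s leaves a 10 ms budget per batch.
        System.out.println(targetMillis(1000, 100000F));
        // A bulk write that takes 50 ms already exceeds that budget, so no wait is added.
        System.out.println(waitMillis(1000, 100000F, 50.0));
        // For contrast, 500 requests/s gives a 2000 ms budget per batch.
        System.out.println(targetMillis(1000, 500F));
    }
}
```

With a budget of about 10 ms per batch, almost any real bulk write takes longer than the budget, so the throttle never has a reason to pause.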
Hi Nik,
Thanks for your answer. The problem is that even when I use .refresh(true), I can't find a way to make the second query wait for the first one to finish its refresh; in my case the first query is much heavier than the second one. I switched from partial updates with a script to update by query because I couldn't find a way to solve this. So I really need the second query to wait until the refresh of the first one has finished in Elasticsearch. I don't know whether I can check something like "my index has a refresh in progress" and wait while it's true before sending the second one. Is that possible?
When you set refresh(true), Elasticsearch will only respond after the refresh is done.
If the second query is in the same thread after the first one, it should work.
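To illustrate the ordering I mean, here is a small simulation. It does not use Elasticsearch at all: CompletableFuture stands in for the future you get from the request builder, and fakeUpdate is a hypothetical placeholder for one update by query with refresh(true). The point is only that a blocking call does not return until the "refresh" step has finished, so the next request cannot start early.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Simulation only: no Elasticsearch classes are used here.
final class SequentialUpdates {

    /** Hypothetical stand-in for one update by query that refreshes before completing. */
    static CompletableFuture<Void> fakeUpdate(String name, long millis, List<String> log) {
        return CompletableFuture.runAsync(() -> {
            log.add(name + " started");
            try {
                Thread.sleep(millis); // pretend to update documents, then refresh
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            log.add(name + " refreshed");
        });
    }

    /** Runs the heavy update, blocks until it completes, and only then starts the light one. */
    static List<String> runInOrder() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        fakeUpdate("first", 200, log).join();  // blocks, much like .get() on the builder
        fakeUpdate("second", 10, log).join();
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        // Prints [first started, first refreshed, second started, second refreshed]
        System.out.println(runInOrder());
    }
}
```

If the two requests were fired from different threads, or the first future were never waited on, nothing would guarantee this order.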
for (int i = 0; i < scripts.length; i++) {
    Script actualScript = new Script(ScriptType.INLINE, "painless", scripts[i], scriptParams[i]);
    UpdateByQueryRequestBuilder ubqrb = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
    ubqrb.source(index).source().setTypes(itemType);
    BulkIndexByScrollResponse response = ubqrb.setSlices(2)
            .setMaxRetries(1000).abortOnVersionConflict(false).script(actualScript)
            .filter(query)
            .refresh(true).get();
    if (response.getBulkFailures().size() > 0) {
        for (BulkItemResponse.Failure failure : response.getBulkFailures()) {
            logger.error("Failure : cause={} , message={}", failure.getCause(), failure.getMessage());
        }
    }
    if (response.isTimedOut()) {
        logger.error("Update By Query ended with timeout!");
    }
    if (response.getVersionConflicts() > 0) {
        logger.warn("Update By Query ended with {} Version Conflicts!", response.getVersionConflicts());
    }
    if (response.getNoops() > 0) {
        logger.warn("Update By Query ended with {} noops!", response.getNoops());
    }
}
I get the version conflict warning every time the first request takes longer than the second one. If I set a breakpoint and wait a little, I get no conflicts.