Java API corresponding to update_by_query

Update-by-query is part of the reindex module, which is transparent to REST users but, sadly, not to Java users. Modules are just plugins that ship inside Elasticsearch's distribution, in the modules directory.

To use it with the Java client you need to do three things:

  1. Declare it as a dependency. In Maven that'd look like:
<dependency>
    <groupId>org.elasticsearch.module</groupId>
    <artifactId>reindex</artifactId>
    <version>2.3.2</version>
</dependency>

If you don't use Maven, the same coordinates should translate to whatever build tool you prefer.
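
For Gradle, for example, I'd expect the same coordinates to look roughly like the following. This is a sketch rather than something I've run against this exact setup: the `implementation` configuration is an assumption about your Gradle version (older Gradle releases use `compile` instead), and the version should match your Elasticsearch version.

```groovy
dependencies {
    // Same group, artifact and version as the Maven snippet above.
    // Assumption: a Gradle version with `implementation`; use `compile` on older ones.
    implementation 'org.elasticsearch.module:reindex:2.3.2'
}
```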

  2. When building the client you need to register the plugin:
clientBuilder.addPlugin(ReindexPlugin.class);
  3. Finally, instead of the normal prepareFoo methods you have to use the action directly:
UpdateByQueryRequestBuilder u = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
BulkIndexByScrollResponse r = u.source("source").filter(matchQuery("field", "value")).get();

The response has two lists of failures and a status member you can use to get counts. The status is exactly the same status you can get from the task management API.

I love that reindex is a plugin because it demonstrates that you can write a pretty decent-sized chunk of functionality as a plugin. It also makes the development process easier on me. But I'm sad that it has this effect on the Java API.