Java API corresponding to update_by_query

2.3 added a great piece of functionality that we're going to use for bulk updating a field in an index with a script. We currently use the Java API, but I just can't seem to find the code that corresponds to the update_by_query HTTP API endpoint. Does that exist? If so, which class(es) is it in?

Update-by-query is part of the reindex module, which is transparent to REST users but, sadly, not to Java users. Modules are just plugins that ship inside of Elasticsearch's distribution in the modules directory.

To use it with the java client you need to do three things:

  1. Declare it as a dependency. In maven that'd look like:
<dependency>
    <groupId>org.elasticsearch.module</groupId>
    <artifactId>reindex</artifactId>
    <version>2.3.2</version>
</dependency>

If you don't like maven I suspect you can translate.

  1. When building the client you need to register the plugin:
clientBuilder.addPlugin(ReindexPlugin.class);
  1. Finally, instead of the normal prepareFoo methods you have to use the action directly:
UpdateByQueryRequestBuilder u = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
BulkIndexByScrollResponse r = u.source("source").filter(matchQuery("field", "value")).get();

The response has two lists of failures and a status member you can use to get counts. The status is exactly the same status as you can get from the task management API.
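Putting the three steps together, a minimal sketch might look like the following. It assumes Elasticsearch 2.3.x with the reindex module on the classpath; the host, port, index name, field, and value are placeholders, and the getter names on the response (getUpdated, getIndexingFailures, getSearchFailures) are from memory of the 2.3 classes, so check them against the version you compile against.

```java
import java.net.InetAddress;

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.index.reindex.BulkIndexByScrollResponse;
import org.elasticsearch.index.reindex.ReindexPlugin;
import org.elasticsearch.index.reindex.UpdateByQueryAction;
import org.elasticsearch.index.reindex.UpdateByQueryRequestBuilder;

import static org.elasticsearch.index.query.QueryBuilders.matchQuery;

public class UpdateByQueryExample {
    public static void main(String[] args) throws Exception {
        // Register the reindex plugin so the client knows about the
        // update-by-query action (otherwise the transport proxy has no entry for it).
        TransportClient client = TransportClient.builder()
                .addPlugin(ReindexPlugin.class)
                .build()
                // Placeholder address: adjust to your cluster.
                .addTransportAddress(
                        new InetSocketTransportAddress(InetAddress.getByName("localhost"), 9300));
        try {
            // Use the action directly; there is no client.prepareUpdateByQuery().
            UpdateByQueryRequestBuilder u = UpdateByQueryAction.INSTANCE.newRequestBuilder(client);
            BulkIndexByScrollResponse r = u
                    .source("source")                      // placeholder index name
                    .filter(matchQuery("field", "value"))  // placeholder query
                    .get();

            // Counts and the two failure lists mentioned above
            // (getter names assumed, see the note before this block).
            System.out.println("updated: " + r.getUpdated());
            System.out.println("indexing failures: " + r.getIndexingFailures().size());
            System.out.println("search failures: " + r.getSearchFailures().size());
        } finally {
            client.close();
        }
    }
}
```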

I love that reindex is a plugin because it demonstrates that you can write a pretty decent sized chunk of functionality as a plugin. It also makes the development process easier on me. But I'm sad that it has this effect on the java api.

Lol markdown made three lists instead of one. You get the idea though.

Thanks, much appreciated!

When I try to use this (via maven dependency) it looks like it is nullpointing in the TransportProxyClient because the action was not registered in the TransportProxyClient proxies. Is there some additional step we have to do?

Never mind. The numbers of the steps threw me off. I read the clientBuilder step as being the replacement for the maven step before re-reading it and realizing my mistake.

OK, one more note. In practice I noticed I had to explicitly specify the 'indices' of the SearchRequest via:
updateByQueryBuilder.request().getSearchRequest().indices(indexNameOrAlias);

Otherwise I got a nullpointer inside the SearchRequest handling:
java.lang.NullPointerException
at org.elasticsearch.action.search.SearchRequest.writeTo(SearchRequest.java:580)

Is there an easier way to specify the index to target?
