How to use ReindexRequestBuilder in combination with the Task API?

Hi,

I am trying to migrate some Elasticsearch 2.3.4 code to the new Elasticsearch 5.0.0 and I'm struggling with the Reindex API.

What I'm trying to achieve is the equivalent of this:
POST /_reindex?wait_for_completion=false
{"source":{"index":"old_index","size":100},"dest":{"index":"new_index"}}

And here is what I have so far:
ReindexRequestBuilder reindexRequestBuilder = ReindexAction.INSTANCE.newRequestBuilder(client);
reindexRequestBuilder.source(index).destination(newIndex);
reindexRequestBuilder.source().setSize(100);

The problem is that I don't see any way to set the wait_for_completion parameter through the Java API, so the request blocks and, as far as I can tell, no task is created. That leaves me with no way to track the progress of the reindex job.

Can anyone point me in the right direction?
Aymeric

There is no setter or option for wait_for_completion on either ReindexRequest or ReindexRequestBuilder. If you just want the behaviour that wait_for_completion=false gives you, invoke the reindex through the client asynchronously.

Executing asynchronously:

ListenableActionFuture<? extends BulkIndexByScrollResponse> future = reindexRequestBuilder.execute();

Then you can get a reference to the running tasks via the task list API:

client().admin().cluster()
   .prepareListTasks()
   .setActions(ReindexAction.NAME)
   .setDetailed(true)
   .get();

Later on, you can block until the reindex finishes and retrieve the response like this:

BulkIndexByScrollResponse response = future.get();

Obviously, if the reindex finishes quickly, you may not be able to get hold of the task before it completes.
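
For readers unfamiliar with the pattern, here is a library-free sketch of the same "start now, poll meanwhile, block later" flow using only the JDK. It does not touch Elasticsearch at all: `AsyncJobSketch`, `startJob` and the `progress` counter are hypothetical stand-ins for the reindex request, `execute()` and the task status respectively.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

final class AsyncJobSketch {

    // Hypothetical stand-in for a long-running reindex: "copies" `total`
    // documents on a background thread and publishes progress as it goes.
    static CompletableFuture<Long> startJob(long total, AtomicLong progress) {
        return CompletableFuture.supplyAsync(() -> {
            for (long i = 0; i < total; i++) {
                progress.incrementAndGet(); // one document processed
            }
            return total; // the final "response"
        });
    }

    public static void main(String[] args) {
        AtomicLong progress = new AtomicLong();

        // Returns immediately, like execute() on the request builder.
        CompletableFuture<Long> future = startJob(1_000_000L, progress);

        // The caller is free to poll progress while the job runs.
        System.out.println("processed so far: " + progress.get());

        // Block only at the point where the final result is needed.
        long copied = future.join();
        System.out.println("finished, documents copied: " + copied);
    }
}
```

As with the real reindex, a short job may already be finished by the time the first poll happens.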

Thank you very much for the quick response. I'm getting one step closer :)

The problem now is that the task list API only gives me the list of all reindex tasks. Assuming I have two reindex jobs running at the same time, how can I know that task 1 is a reindex of index 1 and task 2 a reindex of index 3 (for example)?
That's the only piece missing for me to track the progress of a long(ish) reindex.

Regards
Aymeric

We don't have a thing for that at this point. I mean, you can use the start time but it isn't very good. The REST API gets it right by using the task id. You can't do that because the task id doesn't flow back over the transport client. I'm fairly involved with reindex and the tasks APIs so I'll reach out to the tasks API author and see if we can brainstorm about this sometime.
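
The start-time workaround could be sketched roughly as follows. This is only an illustration under assumptions: `TaskRecord` is a hypothetical holder for the task id and start time you would copy out of each entry in the task list response, and `ReindexTaskMatcher`, `closestByStartTime` and the tolerance parameter are made up for the example. It also shows why the approach isn't very good: it depends on the client and node clocks agreeing, and two reindex jobs started close together cannot be told apart reliably.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical holder for the two values taken from each listed task.
final class TaskRecord {
    final String taskId;        // e.g. "nodeId:123"
    final long startTimeMillis; // task start time as reported by the cluster

    TaskRecord(String taskId, long startTimeMillis) {
        this.taskId = taskId;
        this.startTimeMillis = startTimeMillis;
    }
}

final class ReindexTaskMatcher {

    // Returns the task whose start time is closest to the moment the request
    // was submitted, provided it falls within the given tolerance.
    static Optional<TaskRecord> closestByStartTime(List<TaskRecord> tasks,
                                                   long submittedAtMillis,
                                                   long toleranceMillis) {
        TaskRecord best = null;
        long bestDelta = Long.MAX_VALUE;
        for (TaskRecord task : tasks) {
            long delta = Math.abs(task.startTimeMillis - submittedAtMillis);
            if (delta <= toleranceMillis && delta < bestDelta) {
                best = task;
                bestDelta = delta;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

In use, you would record `System.currentTimeMillis()` just before calling `execute()`, then pass that value together with the listed tasks to `closestByStartTime`.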

You should probably call setShouldStoreResult(true) so that, when the reindex action is complete, you can fetch the result using the task get API.

Thanks Nik. I'll make sure to read the patch notes of the upcoming releases then.