Rest High Level Client - 6.8.13 - long reindex api call - timeout

Hello,

We are upgrading an existing application from TransportClient to RestHighLevelClient 6.8.13 (as part of a 5.6 to 6.8 upgrade) and are running into timeout issues with the reindex API.

We copy indices through a reindex operation, which can take a long time (from a few minutes up to a few hours on big indices).

In the former code, the current thread blocks until reindexing is done:

ReindexRequestBuilder builder =
    ReindexAction.INSTANCE
        .newRequestBuilder(client)
        .source(from)
        .destination(to)
        .abortOnVersionConflict(true)
        .setRequestsPerSecond(10000);
BulkByScrollResponse result = builder.get();

A naive adaptation leads to:

final ReindexRequest reindexRequest =
     new ReindexRequest()
         .setSourceIndices(from)
         .setDestIndex(to)
         .setAbortOnVersionConflict(true)
         .setRequestsPerSecond(10000);
BulkByScrollResponse result = client.reindex(reindexRequest, RequestOptions.DEFAULT);

But the call times out:
Caused by: java.io.IOException: listener timeout after waiting for [90000] ms

I can set a high timeout value when creating the client:

clientBuilder
    .setRequestConfigCallback(
        requestConfigBuilder -> requestConfigBuilder.setSocketTimeout(timeout))
    // If left unset, this timeout is lower than the socket timeout and the request ends according to this setting,
    // so force it to the socket timeout value until the setting is removed
    .setMaxRetryTimeoutMillis(timeout);

I was able to run the whole process, but this does not look like a valid solution to me.
The client is shared across the whole application, and these timeouts seem too high to apply to standard queries. I could create a dedicated long-timeout client.

Another workaround would be a real submitReindexTask call: create a task, poll its state until it ends, then grab the result. But this seems to generate too much boilerplate code to be the way to go.
Moreover, I don't really see how I would get the result from the task:

// untested sample code to show the concept
final TaskSubmissionResponse reindexSubmission = elasticSearchClientProvider.getClient()
    .submitReindexTask(reindexRequest, RequestOptions.DEFAULT);
final TaskId taskId = new TaskId(reindexSubmission.getTask());
final GetTaskRequest getTaskRequest = new GetTaskRequest(taskId.getNodeId(), taskId.getId());

GetTaskResponse taskStatus;
do {
  // need to wait somewhere around here :
  Thread.sleep(pollDelay);
  taskStatus =
      elasticSearchClientProvider
          .getClient()
          .tasks()
          .get(getTaskRequest, RequestOptions.DEFAULT)
          .orElseThrow(() -> new IllegalStateException("Reindex task not found: " + taskId));
} while(!taskStatus.isCompleted());

BulkByScrollResponse result = ???
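
The wait loop above could probably be factored into a small generic helper so the boilerplate stays out of the calling code. Below is a minimal sketch using only the JDK. `TaskPoller` and `pollUntil` are hypothetical names, not part of the Elasticsearch client, and the sketch does not answer how to turn the task result into a BulkByScrollResponse.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;
import java.util.function.Predicate;

/** Hypothetical helper (not an Elasticsearch class): polls a status until it is done. */
final class TaskPoller {
    private TaskPoller() {}

    /**
     * Calls fetch repeatedly, sleeping for delay between calls, until isDone
     * accepts the fetched value. Gives up with a TimeoutException once maxWait
     * has elapsed. Callable is used so that fetch may throw checked exceptions
     * such as IOException.
     */
    static <T> T pollUntil(Callable<T> fetch, Predicate<T> isDone,
                           Duration delay, Duration maxWait) throws Exception {
        final long deadline = System.nanoTime() + maxWait.toNanos();
        while (true) {
            T status = fetch.call();
            if (isDone.test(status)) {
                return status;
            }
            // Compare via subtraction so the check stays correct if nanoTime wraps.
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("Gave up polling after " + maxWait);
            }
            Thread.sleep(delay.toMillis());
        }
    }
}
```

With the snippet above, fetch would be the `tasks().get(getTaskRequest, RequestOptions.DEFAULT).orElseThrow(...)` call and isDone would be `GetTaskResponse::isCompleted`. The upper bound on the wait is an addition of this sketch; the original loop waits indefinitely.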

Is there any other option that could help with long-running operations through the Rest High Level Client?
If I go with the task solution, how can I get a BulkByScrollResponse back from the task API?

Thanks for your help
