When we try to duplicate an index with the
_reindex command, it always fails with "502, Bad Gateway" and "Client request timeout".
However, running
GET _cat/indices/_all right after the error shows that the destination index does exist.
We also noticed that the resulting destination index is slightly different in size from the source index: 158.8mb vs. 149.5mb.
We had tried the
_reindex command on a smaller source index (80.8mb) before; it finished without any error, and the source and destination indices ended up the same size.
The sample commands and outputs:
# "statusCode": 502,
# "error": "Bad Gateway",
# "message": "Client request timeout"
# yellow open dest YUBT2IhRQGSnnIFPrtaI_Q 1 1 248013 0 158.8mb 158.8mb
# yellow open my-index-000001 J_ttsV5kRrSnX5_iufQD-Q 1 1 257303 0 149.5mb 149.5mb
- Why did the
_reindex command get the timeout error in the above test case?
- Why is the size of the destination index different from that of the source?
We are new to Elasticsearch, and we highly appreciate any hints and suggestions.
Hi @Mike_Z, you are probably timing out, so perhaps try running the reindex asynchronously.
That is really what you want to do anyway. You will get a task id, which you can then use to check the status via the task API.
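A minimal sketch of the asynchronous form, assuming the source and destination index names from the _cat output above (my-index-000001 and dest); the task id returned by the POST goes into the GET:

```
POST _reindex?wait_for_completion=false
{
  "source": { "index": "my-index-000001" },
  "dest": { "index": "dest" }
}

GET _tasks/<task_id>
```

The POST returns immediately with a task id instead of waiting for the copy to finish, which avoids the client-side timeout entirely.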
Although 150mb is pretty small, so perhaps there is some other reason, but that is what I would try first.
You are not guaranteed that the resulting index will be the exact same size in bytes as the source, as they may have a different number of segments etc.
Only if you force merge both down to 1 segment after everything is done will they be the same size.
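Force merging is done with the _forcemerge API; a sketch for the two indices named above, assuming all writes have finished:

```
POST my-index-000001/_forcemerge?max_num_segments=1
POST dest/_forcemerge?max_num_segments=1
```

After both indices are merged down to a single segment, their sizes should be directly comparable.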
I would look closely at the documentation, as there are many options to optimize reindex for large indices.
Thank you, @stephenb .
We tried the suggested command with wait_for_completion=false, and it returned right away without the timeout error.
After it returned, the index copy was still in progress, so with the command
GET _cat/indices/_all we could watch the destination index keep growing in size until it was done.
By the way, I have a related follow-up question.
I think the timeout happens in Kibana, right? If so, where can we find the log events?
So I logged into the container with the command
docker exec -u root -it kibana /bin/bash and looked for log files in
/var/log/, but did not find anything relevant:
| |-- eipp.log.xz
| |-- history.log
| `-- term.log
1 directory, 11 files
The Docker container writes to stdout I believe... so you need to look there...
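For a container that logs to stdout, the logs can be read from the host with the Docker CLI; a sketch assuming the container is named kibana, as in the docker exec command above:

```
# Follow the Kibana container's stdout/stderr from the host
docker logs -f kibana

# Or show only the most recent entries
docker logs --tail 100 kibana
```

There is nothing under /var/log/ inside the container because the Kibana process writes straight to stdout, which Docker captures on the host side.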
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.