Reindex task completed but no response was returned

naoko · November 18, 2018, 1:18pm

I reindexed an index that has about 9 million docs.
_cat/indices shows 9 million, but _count shows 3.6 million.
The docs_deleted count in _cat/indices is less than 40k, so my first question is why the two counts differ so much.
Once the reindex started, I verified that the task had been created and was processing. When the document count on the new index got very close to what _count shows, the task was no longer listed. I assumed the task had completed without issue. However, after the task was gone, I waited 40+ minutes but got no response back from the reindex API. The document count on the new index was no longer increasing and was 50+ short of what I expected (I later verified by query that those 50+ docs were indeed missing from the new index).
Since the task was gone and the docs didn't seem to be moving at all, I canceled the API call.
I would like to know if anyone has insight into what might have happened, and I would appreciate advice on what I can do if this happens again.

version 5.5.2

Thank you.

nik9000 · November 18, 2018, 1:45pm

The task will vanish once the reindex stops unless you set wait_for_completion=false. As for why you never got a response back, do you happen to access Elasticsearch through a proxy? If so, it might have chopped the connection without telling you about it. Another possibility is that the node running the reindex stopped. Or something more interesting happened. Those things should all show up in the logs. It is certainly possible that there is a bug here, but I'd check the logs first. If nothing is in the logs, I'd hunt for a proxy.

naoko · November 18, 2018, 3:54pm

Thank you @nik9000 for the quick response.
First, I re-reviewed the task information, and all the docs it said it would move were actually moved. The 50+ missing documents were added AFTER the reindex command was issued, so that was my fault.
I did set wait_for_completion=true.
Thank you for pointing me to the logs. I did not see any errors. All I saw was the index being created when I started the reindex, followed by a series of mapping updates.
So now I'm sure that Elasticsearch completed the task without any issue.
The only remaining problem is that I did not get a response back, and I do not use a proxy. I used the master node's IP address to make the request.
I set the timeout to about 4 hours, but the actual task completed in 1.5 hours.
Perhaps checking the task status is more accurate than waiting for completion over HTTP.
For such long tasks, would you recommend not waiting for completion and instead checking the task status to know when the task has completed?

nik9000 · November 18, 2018, 4:31pm

Yes, for long tasks I recommend setting wait_for_completion to false and polling the task until it completes.
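
The polling approach recommended here could look roughly like the sketch below. It is a minimal illustration, not a tested recipe for 5.5.2: it assumes the cluster is reachable at http://localhost:9200 without authentication, that `POST _reindex?wait_for_completion=false` replies with a `task` id, and that `GET _tasks/<task_id>` reports a `completed` flag. The helper names (`http_json`, `start_reindex`, `wait_for_task`) and the 30 second interval are made up for this example.

```python
import json
import time
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: point this at your own cluster


def http_json(method, url, body=None):
    """Send a request with an optional JSON body and decode the JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def start_reindex(source, dest, request=http_json):
    """Start a reindex in the background and return the task id."""
    body = {"source": {"index": source}, "dest": {"index": dest}}
    reply = request("POST", ES_URL + "/_reindex?wait_for_completion=false", body)
    return reply["task"]


def wait_for_task(task_id, request=http_json, interval=30.0, sleep=time.sleep):
    """Poll GET _tasks/<task_id> until the task reports that it has completed."""
    while True:
        status = request("GET", ES_URL + "/_tasks/" + task_id)
        if status.get("completed"):
            return status
        sleep(interval)
```

Calling `wait_for_task(start_reindex("old_index", "new_index"))` (hypothetical index names) returns the final task document. Each poll is a short request, so no single HTTP connection has to stay open for the whole 1.5 hours.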

naoko · November 18, 2018, 4:34pm

Thank you very much for your advice.
