I am reindexing an index that has about 9 million documents. _cat/indices shows 9 million, but _count shows 3.6 million. The docs.deleted count in _cat/indices is less than 40k, so my first question is why there is such a large discrepancy.
Once the reindex started, I verified that the task was created and processing. When the document count on the new index got very close to what _count shows, the task was no longer listed, so I assumed it had completed without any issue. However, after the task was gone, I waited 40+ minutes and got no response back from the reindex API. The document count on the new index was no longer increasing and was actually 50+ short of what I expected (I later verified by query that those 50+ docs are indeed missing from the new index).
Since the task was gone and the document count didn't seem to be changing at all, I canceled the API call.
I would like to know if anyone has insight into what might have happened, and I would appreciate advice on what I can do if this happens again.
The task will vanish once the reindex stops unless you set wait_for_completion=false. As for why you never got a response back, do you happen to access Elasticsearch through a proxy? If so, it might have dropped the connection without telling you about it. Another possibility is that the node running the reindex stopped. Or something more interesting happened. Those things should all show up in the logs. It is certainly possible that there is a bug here, but I'd check the logs first. If nothing is in the logs, I'd hunt for a proxy.
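To make the wait_for_completion=false route concrete, here is a minimal sketch of starting the reindex asynchronously and then polling the task API instead of holding one long HTTP request open. The host in `ES_URL`, the index names, the polling interval, and the helper names (`reindex_request`, `task_summary`, `reindex_and_wait`) are all assumptions for illustration, and it assumes an unsecured cluster reachable over plain HTTP; adapt it to your setup.

```python
import json
import time
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: replace with your node's address


def reindex_request(source_index, dest_index):
    """Build the URL and body for a reindex that returns a task id immediately."""
    url = f"{ES_URL}/_reindex?wait_for_completion=false"
    body = {"source": {"index": source_index}, "dest": {"index": dest_index}}
    return url, body


def task_summary(task_response):
    """Pull the completion flag and progress counters out of a GET _tasks/<id> response."""
    status = task_response.get("task", {}).get("status", {})
    return {
        "completed": bool(task_response.get("completed", False)),
        "total": status.get("total", 0),
        "created": status.get("created", 0),
        "updated": status.get("updated", 0),
        # "response" is only present once the task has finished
        "failures": task_response.get("response", {}).get("failures", []),
    }


def _call(method, url, body=None):
    """Send one short JSON request and decode the JSON reply."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def reindex_and_wait(source_index, dest_index, poll_seconds=30):
    """Start the reindex, then poll the task until it reports completion."""
    url, body = reindex_request(source_index, dest_index)
    task_id = _call("POST", url, body)["task"]  # e.g. "nodeId:12345"
    while True:
        summary = task_summary(_call("GET", f"{ES_URL}/_tasks/{task_id}"))
        if summary["completed"]:
            return summary
        time.sleep(poll_seconds)
```

Calling something like `reindex_and_wait("old_index", "new_index")` (hypothetical index names) only ever issues short requests, so a dropped connection costs you one poll rather than the final result, and the summary tells you how many docs were created and whether any failures were reported.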
Thank you @nik9000 for a quick response.
First, I re-reviewed the task information, and all the docs it said it would move were actually moved. The 50+ missing documents were added AFTER the reindex command was issued, so that was my fault.
I did set wait_for_completion=true.
Thank you for pointing me to the logs. I did not see any errors. All I saw was the index being created when I started the reindex, followed by a series of mapping updates.
So now I'm sure that Elasticsearch completed the task without any issue.
The only thing is that I did not get a response back, and I do not use a proxy. I used the master node's IP address to make the request.
I set the timeout to about 4 hours, but the actual task completed in 1.5 hours.
Perhaps checking the task status is more accurate than waiting for completion over HTTP.
For such long tasks, would you recommend not waiting for completion and instead checking the task status to know when the task has completed?