I am relatively new to Elasticsearch, so I have been experimenting with it a bit using the Python API. I am using the Elasticsearch service through AWS, so I have been following its suggestions for restoring snapshots.
Since AWS does not allow me to close indices, I create a temporary index and then use reindex to copy the documents from the temporary index to the default index. The two indices (default and temp) have different numbers of documents. I expected that after calling reindex the temp and default indices would have the same number of documents, but that doesn't happen.
So, for example, I have two indices:
- temp_index with 1000 documents
- index with 900 documents
I expected that after I reindexed temp_index into index, index would have 1000 documents, but it still shows 900. Is there something I am missing about how reindex works?
Could you share the reindex request and response? The version of Elasticsearch is also always useful information.
Are the numbers given the real numbers for doc counts in the two indexes or are they example numbers?
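For anyone gathering the information asked for here, the following is a minimal sketch of running the reindex and collecting the response counters next to the document counts of both indices. It assumes `es` is an `elasticsearch.Elasticsearch` client from the official Python package; the helper name `reindex_report` is hypothetical and not part of any library. Note that reindex writes documents by `_id`, so source documents whose IDs already exist in the destination show up as `updated` rather than `created`, and the destination count afterwards is the number of distinct IDs, which is not necessarily equal to the source count.

```python
def reindex_report(es, source_index, dest_index):
    """Run a reindex and summarize what happened (hypothetical helper).

    `es` is assumed to be an elasticsearch.Elasticsearch client.
    """
    response = es.reindex(
        body={
            "source": {"index": source_index},
            "dest": {"index": dest_index},
        },
        wait_for_completion=True,  # block until the reindex has finished
        refresh=True,              # make the copied documents visible to count/search
    )
    return {
        "total": response.get("total"),
        "created": response.get("created"),
        "updated": response.get("updated"),
        "failures": response.get("failures", []),
        "source_count": es.count(index=source_index)["count"],
        "dest_count": es.count(index=dest_index)["count"],
    }
```

A non-empty `failures` list, or a `total` lower than the source count, would point at the reindex itself rather than at how the counts are read.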
Thanks for the quick reply. I think I figured out what was happening or at least what I think is happening.
I was calling restore and reindex within the same function and I was getting some HTTP 503 errors. Sometimes the index was partially reindexed and other times nothing was reindexed at all. The Python API seems to have a predetermined number of retries when the service is not available.
I found that if I wait a few seconds between restore and reindex, everything works. So I am guessing that my Elasticsearch instance wasn't fully done with the restore when I called reindex. Does this make sense? Is my understanding correct?
BTW, I am using Elasticsearch 5.5.
Yes, that sounds like a plausible explanation. Restore runs asynchronously by default. You can use wait_for_completion=true to make the request wait until the restore operation has completed.
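With the Python client, the suggestion above could look roughly like the sketch below. It assumes `es` is an `elasticsearch.Elasticsearch` client and that the snapshot is restored under a temporary name using the restore API's `rename_pattern` / `rename_replacement` options; the function name, its arguments, and the repository and snapshot names in the usage comment are hypothetical, and the original poster's actual restore request was not shown.

```python
def restore_then_reindex(es, repository, snapshot,
                         snapshot_index, temp_index, target_index):
    """Restore a snapshot into temp_index, then copy it into target_index.

    Hypothetical helper; `es` is assumed to be an
    elasticsearch.Elasticsearch client.
    """
    # Wait for the restore to finish instead of returning immediately.
    es.snapshot.restore(
        repository=repository,
        snapshot=snapshot,
        body={
            "indices": snapshot_index,
            # rename_pattern is a regular expression; a plain index name
            # without special characters matches itself.
            "rename_pattern": snapshot_index,
            "rename_replacement": temp_index,
        },
        wait_for_completion=True,
    )
    # Only start copying once the restored index is fully in place.
    return es.reindex(
        body={
            "source": {"index": temp_index},
            "dest": {"index": target_index},
        },
        wait_for_completion=True,
        refresh=True,
    )

# Example call (placeholder names):
# restore_then_reindex(es, "my-repo", "my-snapshot",
#                      "index", "temp_index", "index")
```

Because the restore call blocks until completion, a large snapshot may need a longer client-side request timeout than the default.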