I have a large index that I am trying to reindex in place using scan and scroll.
I've created a new index and am running a search against the old index with a search type of scan, a match_all query, and a size of 600, then bulk indexing each batch from _source into the new index.
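A minimal sketch of that loop, assuming the elasticsearch-ruby client (the error further down comes from it). The method name `reindex_with_scan` and its parameters are hypothetical and this is not the exact code in use; `client` is assumed to respond to the client's `search`, `scroll` and `bulk` methods.

```ruby
# Hypothetical helper illustrating the scan-and-scroll reindex loop.
# `client` is assumed to be an Elasticsearch::Client (or anything that
# responds to `search`, `scroll` and `bulk` in the same way).
def reindex_with_scan(client, source_index, target_index, scroll: '5m', size: 600)
  # With search_type 'scan' the first response carries no hits, only a scroll id.
  response = client.search(
    index: source_index,
    search_type: 'scan',
    scroll: scroll,
    size: size,
    body: { query: { match_all: {} } }
  )
  scroll_id = response['_scroll_id']
  copied = 0

  loop do
    # Each scroll request passes the scroll time again.
    response = client.scroll(scroll_id: scroll_id, scroll: scroll)
    hits = response['hits']['hits']
    break if hits.empty?

    # Use the scroll id from the latest response for the next request.
    scroll_id = response['_scroll_id']

    # Build one bulk "index" action per hit, copying _source unchanged.
    actions = hits.map do |hit|
      { index: { _index: target_index, _type: hit['_type'], _id: hit['_id'], data: hit['_source'] } }
    end
    client.bulk(body: actions)
    copied += hits.size
  end

  copied
end
```

Usage would be along the lines of `reindex_with_scan(Elasticsearch::Client.new, 'old_index', 'new_index')`, where both index names are placeholders.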
After I hit 600k documents, it fails with a timeout. The scroll time I am using is 5 minutes, and everything I have read on here says you shouldn't need a long scroll time. It also says that each scroll request should reset the timeout, which clearly is not happening, so I'm confused.
Then I read in the docs that you can clear the scroll, so I changed my loop to clear the previous scroll on each iteration. But as soon as I try to clear the scroll after bulk indexing, I get:
Elasticsearch::Transport::Transport::Errors::NotFound: [404] {"_scroll_id":"c2NhbjswOzE7dG90YWxfaGl0czozMDAwOw==","took":1,"timed_out":false,"_shards":{"total":5,"successful":0,"failed":5,"failures":[{"status":404,"reason":"SearchContextMissingException[No search context found for id [121]]"},{"status":404,"reason":"SearchContextMissingException[No search context found for id [122]]"},{"status":404,"reason":"SearchContextMissingException[No search context found for id [123]]"},{"status":404,"reason":"SearchContextMissingException[No search context found for id [124]]"},{"status":404,"reason":"SearchContextMissingException[No search context found for id [125]]"}]},"hits":{"total":3000,"max_score":0.0,"hits":[]}}
What am I doing wrong?