I have a very large data set spread over multiple indexes, and I want to grab each record, transform it, and write it into another index. Reading the docs points me towards scan & scroll and then some bulk indexing. What concerns me is failure during this copy: it seems there is no way to 'resume' the job if it fails in the middle. Based on some initial tests, this copy will take a long time to run, and I wonder if I have overlooked some options or am just not thinking of something. The only thing I can think of is persisting the scroll IDs and keeping them open for an extended period of time, but the downside is that this would have a strong negative impact on ES memory.
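In case it helps to be concrete, here is roughly the shape of the job I have in mind, sketched with the Python client (elasticsearch-py) and its scan/bulk helpers. The host, index names, query, and transform step are just placeholders, not the real job:

    from elasticsearch import Elasticsearch
    from elasticsearch.helpers import scan, bulk

    es = Elasticsearch()  # placeholder: assumes a node on localhost:9200

    def transformed_actions():
        # scan() wraps scan & scroll; scroll= is the keep-alive per round trip,
        # not a lifetime for the whole job.
        for hit in scan(es, index="source-*", scroll="5m",
                        query={"query": {"match_all": {}}}):
            doc = hit["_source"]
            # ... apply the per-record transform here (placeholder) ...
            yield {
                "_op_type": "index",
                "_index": "target-index",   # placeholder target
                "_type": hit["_type"],
                "_id": hit["_id"],
                "_source": doc,
            }

    # bulk() batches the generator into bulk index requests.
    bulk(es, transformed_actions(), chunk_size=1000)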
Yes, if you get an error while scan/scroll is active, you have to close the
procedure and restart from the beginning.
Not sure what you mean by an "extended period of time", but you can surely keep the cursor open for some minutes without too much impact.
Jörg
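To make that concrete: the scroll keep-alive is renewed on every scroll request, so it only has to cover the time spent processing one batch, not the whole copy. A rough sketch with the same assumed Python client (index name and batch handling are placeholders):

    from elasticsearch import Elasticsearch

    es = Elasticsearch()  # assumed local node

    # scroll="2m" is a per-request keep-alive: each scroll call renews it,
    # so it only needs to outlive the processing of a single batch.
    resp = es.search(index="source-*", scroll="2m", size=500,
                     body={"query": {"match_all": {}}})

    while resp["hits"]["hits"]:
        for hit in resp["hits"]["hits"]:
            pass  # transform hit["_source"] and hand it to a bulk indexer here
        resp = es.scroll(scroll_id=resp["_scroll_id"], scroll="2m")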