I have a very large data set spread over multiple indexes, and I want to grab each record, transform it, and write it into another index. Reading the docs points me towards scan & scroll and then some bulk indexing. What concerns me is failure during this copy: it seems there is no way to 'resume' the job if it fails in the middle. Based on some initial tests, this copy will take a long time to run, and I wonder if I have overlooked some options or am just not thinking of something. The only thing I can think of is persisting the scroll IDs and keeping them open for an extended period of time, but the downside is that this would have a strong negative impact on ES memory.
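In case it helps to be concrete, here is roughly the shape of the job I have in mind, sketched with the Python client (elasticsearch-py) and its scan/bulk helpers. The host, index names, query, and transform step are just placeholders, not the real job:

    from elasticsearch import Elasticsearch
    from elasticsearch.helpers import scan, bulk

    es = Elasticsearch()  # placeholder: assumes a node on localhost:9200

    def transformed_actions():
        # scan() wraps scan & scroll; scroll= is the keep-alive per round trip,
        # not a lifetime for the whole job.
        for hit in scan(es, index="source-*", scroll="5m",
                        query={"query": {"match_all": {}}}):
            doc = hit["_source"]
            # ... apply the per-record transform here (placeholder) ...
            yield {
                "_op_type": "index",
                "_index": "target-index",   # placeholder target
                "_type": hit["_type"],
                "_id": hit["_id"],
                "_source": doc,
            }

    # bulk() batches the generator into bulk index requests.
    bulk(es, transformed_actions(), chunk_size=1000)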
Yes, if you get an error while scan/scroll is active, you have to close the
procedure and restart from the beginning.
Not sure what you mean by an "extended period of time", but you can surely keep the cursor open for some minutes without too much impact.
Jörg
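To make that concrete: the scroll keep-alive is renewed on every scroll request, so it only has to cover the time spent processing one batch, not the whole copy. A rough sketch with the same assumed Python client (index name and batch handling are placeholders):

    from elasticsearch import Elasticsearch

    es = Elasticsearch()  # assumed local node

    # scroll="2m" is a per-request keep-alive: each scroll call renews it,
    # so it only needs to outlive the processing of a single batch.
    resp = es.search(index="source-*", scroll="2m", size=500,
                     body={"query": {"match_all": {}}})

    while resp["hits"]["hits"]:
        for hit in resp["hits"]["hits"]:
            pass  # transform hit["_source"] and hand it to a bulk indexer here
        resp = es.scroll(scroll_id=resp["_scroll_id"], scroll="2m")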