Clearing a scroll during scan and scroll?

Hi,

You've probably seen this page already, but it may be worth another look:

https://www.elastic.co/guide/en/elasticsearch/guide/1.x/scan-scroll.html

When you first create the scroll, you specify a keep-alive time, for example scroll=1m. Each subsequent request that uses the scroll id also needs to specify scroll=1m. This keeps the scroll open for an additional minute, counted from that request. As per the docs:

"Note that we again specify ?scroll=1m. The scroll expiry time is refreshed every time we run a scroll request, so it needs to give us only enough time to process the current batch of results, not all of the documents that match the query."

Lastly, each subsequent scroll request must pass the scroll id returned by the previous scroll request, not the one from the initial search:

"The scroll request also returns a new _scroll_id. Every time we make the next scroll request, we must pass the _scroll_id returned by the previous scroll request."
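The two quoted rules can be put together in a short loop. This is only a minimal sketch in Python, not code from the guide: the function name scroll_all and the two callables start and next_page are hypothetical placeholders for whatever HTTP client you use, and the response shape (a "_scroll_id" key plus "hits" -> "hits") is assumed to match what the scroll API returns.

```python
from typing import Callable, Dict, Iterator


def scroll_all(
    start: Callable[[str], Dict],
    next_page: Callable[[str, str], Dict],
    keep_alive: str = "1m",
) -> Iterator[Dict]:
    """Yield every hit from a scroll, one batch at a time.

    start(keep_alive) is a hypothetical callable that runs the initial
    search with ?scroll=<keep_alive> and returns the parsed response.
    next_page(scroll_id, keep_alive) is a hypothetical callable that runs
    one scroll request and returns the parsed response.
    """
    response = start(keep_alive)
    # With search_type=scan the first response carries no hits, only a
    # scroll id; a plain scroll would already return the first batch here.
    yield from response.get("hits", {}).get("hits", [])
    scroll_id = response["_scroll_id"]

    while True:
        # Send the keep-alive again on every request so the expiry is refreshed.
        response = next_page(scroll_id, keep_alive)
        # Always carry forward the id returned by the most recent request.
        scroll_id = response.get("_scroll_id", scroll_id)
        hits = response["hits"]["hits"]
        if not hits:
            # An empty batch from a scroll request means the scroll is exhausted.
            return
        yield from hits
```

In real use, start and next_page would wrap the HTTP calls to the search endpoint and the scroll endpoint. If your loop reuses the first scroll id for every request, or leaves out the scroll parameter on later requests, that is the first thing I would check.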

So I would check whether your code follows this pattern when retrieving the results. Alternatively, you can use Logstash to re-index your data. Take a look at this blog post:

http://david.pilato.fr/blog/2015/05/20/reindex-elasticsearch-with-logstash/