Is it necessary to pass the scrollId timeout value in each subsequent search
scroll request?
As in the example at
http://www.elasticsearch.org/guide/reference/api/search/search-type.html
On Wed, 2011-06-15 at 19:53 -0400, James Cook wrote:
Is it necessary to pass the scrollId timeout value in each subsequent
search scroll request?
Yes: you pass both the scroll timeout and the new _scroll_id from the
previous search or scroll request.
The final request will return zero hits, which is how you know that
you're done.
clint
Hi Clinton,
It doesn't seem to work that way. At least anecdotally, I don't have to pass
the timeout value in subsequent search scroll requests. Perhaps this means
that my "cursor" will time out based on the amount of time that has elapsed
from the initial scroll request, but I was checking to see if this is the
case.
I wish the last result did return 0 hits, but instead it is currently
(0.16.2) throwing an exception.
-- jim
Hi James
It doesn't seem to work that way. At least anecdotally, I don't have
to pass the timeout value in subsequent search scroll requests.
I wish the last result did return 0 hits, but instead it is currently
(0.16.2) throwing an exception.
Scroll search always throws IndexOutOfBoundsException on last iteration · Issue #1008 · elastic/elasticsearch · GitHub
Is the fact that you're not passing the timeout not the reason that you
are seeing the exception?
I use scrolling as I described, and it works without any errors. Note:
if you don't pass the timeout, you may still see a few successful scroll
results, but it won't last.
clint
I attached a simple gist to the issue to reproduce it. If it succeeds for you,
perhaps there is a config parameter that differs between our setups and
has some impact on the results?
So, the timeout value needs to be constantly supplied on each successive
search. I can't think of a use case where that is a helpful feature.
Couldn't ES reset the TTL to its original value each time a scroll_id is
referenced? I suppose passing the timeout each time only makes sense if:
a) The TTL needs to change while retrieving pages of results, or
b) ES doesn't have a way of knowing what the original TTL was.
Either way, it is a simple workaround even though it is a bit strange.
-- jim
Hi James
On Thu, 2011-06-16 at 03:15 -0400, James Cook wrote:
I attached a simple gist to the issue to recreate. If it succeeds for
you, perhaps there is a config parameter which is different between
our set ups which has some impact on the results?
I've got no special config
Here's an example of a scrolled search which works for me on 0.16.2:
clint
Thanks, Clinton. Were you able to reproduce the exception with my steps?
Hi James
On Thu, 2011-06-16 at 09:44 -0400, James Cook wrote:
Thanks clinton. Were you able to duplicate my recreation and the
exception?
I don't use the Java API, I'm afraid (or Java for that matter), so no.
But if you look through the curl script that I linked to, you can check
that you're doing the same steps that I did, and if there is a
difference, then that's probably where the issue is.
If there isn't, well, then it may be a bug.
clint
It's not a Java recreation. It's Curl.
It's very simple:
curl -XPOST 'http://localhost:9200/twitter/tweet/1' -d '{ "user": "kimchy" }'
curl -XGET 'localhost:9200/_search?search_type=scan&scroll=5m&pretty=true' -d '{ "query" : { "term": {"user":"kimchy"} } }'
curl -XGET 'localhost:9200/_search/scroll?scroll=5m&pretty=true' -d ''
curl -XGET 'localhost:9200/_search/scroll?scroll=5m&pretty=true' -d ''
Sorry - I'm flu ridden - I missed that:
get scrollID
curl -XGET 'localhost:9200/_search/scroll?scroll=5m&pretty=true' -d ''
returns 1 hit
curl -XGET 'localhost:9200/_search/scroll?scroll=5m&pretty=true' -d ''
Which scroll ID are you passing to the last statement? The scroll ID
from the search? Or the scroll ID from the previous scroll request?
It should be the latter.
clint
James, I will check your case. Providing the scroll parameter (with the timeout) means that you want to continue scrolling the request. Not passing it means that you don't. When scrolling, you need to make sure that you pass the scroll ID you got from the previous response to the next scroll request.
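The rule described here (send the timeout on every request, and always use the scroll ID from the most recent response) can be sketched in Python roughly as follows. This is only an illustration, not an official client: `scroll_hits`, `fetch`, `fake_fetch`, and `PAGES` are hypothetical names, and the fake responses only mimic the `_scroll_id` and `hits.hits` fields of a real response. In real use, `fetch` would issue the `_search/scroll?scroll=5m` request with the given scroll ID as the body.

```python
from typing import Callable, Dict, Iterator

# Hypothetical transport: (scroll_id, keep_alive) -> parsed JSON response.
Fetch = Callable[[str, str], Dict]


def scroll_hits(initial_scroll_id: str, fetch: Fetch,
                keep_alive: str = "5m") -> Iterator[Dict]:
    """Yield hits page by page until a scroll request returns zero hits."""
    scroll_id = initial_scroll_id
    while True:
        # The timeout is sent on every request, not only on the first one.
        response = fetch(scroll_id, keep_alive)
        hits = response["hits"]["hits"]
        if not hits:
            return  # an empty page signals that the scroll is exhausted
        yield from hits
        # Use the ID from THIS response for the next request,
        # never the one from the initial search.
        scroll_id = response.get("_scroll_id", scroll_id)


# Stand-in for a cluster: each scroll ID maps to the response it produces.
PAGES = {
    "id-0": {"_scroll_id": "id-1", "hits": {"hits": [{"_id": "1"}]}},
    "id-1": {"_scroll_id": "id-2", "hits": {"hits": [{"_id": "2"}]}},
    "id-2": {"_scroll_id": "id-3", "hits": {"hits": []}},
}
calls = []


def fake_fetch(scroll_id: str, keep_alive: str) -> Dict:
    calls.append((scroll_id, keep_alive))
    return PAGES[scroll_id]


ids = [hit["_id"] for hit in scroll_hits("id-0", fake_fetch)]
print(ids)
print(calls)
```

Reusing "id-0" for every request, as in the reproduction steps earlier in the thread, is exactly the mistake this loop avoids.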
Thanks Shay and Clinton. That must be the problem. I have been using the
scroll_id from the very first "setup" request to make subsequent requests.
I'll add a comment to the issue if this is the case.
--- jim