Scroll and scan issue, no new scroll_id generated

abshkd · August 6, 2012, 10:35pm

Hi,
I am stuck on a weird problem.

I am running a text_phrase query over a range of timestamps. The query itself
works fine. I want to download the entire result set (some 1.5 million
documents), so I used the scan + scroll_id method.

This is my curl command:

curl -XGET 'remotehost:9200/graylog2/messsage/_search?search_type=scan&scroll=10m&size=10' -d '
{
"query":{"bool":{"must":[{"text":{"message":"this is an error"}},{"range":{"created_at":{"from":"1341597072","to":"1344275472"}}}]}}
}'

I get the scroll ID in the response, as expected.

Next step: use the scroll ID on the _search/scroll endpoint:

curl -XGET 'remotehost:9200/_search/scroll?scroll=10m' -d 'c2Nhbjs1OzgwOnRxMnhsNGhuVFZxeE0xUTRyN3piYlE7Nzk6dHEyeGw0aG5UVnF4TTFRNHI3emJiUTs3ODp0cTJ4bDRoblRWcXhNMVE0cjd6YmJROzc3OnRxMnhsNGhuVFZxeE0xUTRyN3piYlE7NzY6dHEyeGw0aG5UVnF4TTFRNHI3emJiUTsxO3RvdGFsX2hpdHM6MTIwNDQ3NTs='

However, the response gives me back the same scroll ID in the JSON field
'_scroll_id'.
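
For anyone inspecting an ID like this: the scroll ID posted above happens to be plain Base64, so it can be decoded to see what it carries. The sketch below uses only the Python standard library. Reading the fields as "search type; number of shard contexts; context entries; attributes" is an assumption about Elasticsearch internals of that era, not something stated in this thread.

```python
import base64

# The scroll ID exactly as posted above.
SCROLL_ID = (
    "c2Nhbjs1OzgwOnRxMnhsNGhuVFZxeE0xUTRyN3piYlE7Nzk6dHEyeGw0aG5UVnF4TTFRNHI3"
    "emJiUTs3ODp0cTJ4bDRoblRWcXhNMVE0cjd6YmJROzc3OnRxMnhsNGhuVFZxeE0xUTRyN3pi"
    "YlE7NzY6dHEyeGw0aG5UVnF4TTFRNHI3emJiUTsxO3RvdGFsX2hpdHM6MTIwNDQ3NTs="
)


def decode_scroll_id(scroll_id: str) -> str:
    """Base64-decode a scroll ID into its raw ASCII form."""
    return base64.b64decode(scroll_id).decode("ascii")


decoded = decode_scroll_id(SCROLL_ID)
# Split on ';' to look at the individual fields (the trailing ';' yields an empty item).
fields = [f for f in decoded.split(";") if f]

if __name__ == "__main__":
    for field in fields:
        print(field)
```

If the ID only encodes per-shard search contexts and a total hit count, as it appears to here, then getting an identical ID back would not by itself indicate a problem.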

As a result, I can make one more request with that scroll ID, after which the
next request fails with:

{"error":"ArrayIndexOutOfBoundsException[174]","status":500}

I am not sure what I am doing wrong here. Please help.

Thank you

kimchy · August 7, 2012, 9:27pm

Are you using the scroll id from the previous response as the one to use for the next request?
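
To make the question concrete, here is a minimal sketch of a scroll loop that always sends the _scroll_id from the most recent response and stops when a page comes back with no hits. It is only an illustration, not a definitive client: the names scroll_all and http_fetch_page are hypothetical, the host and the 10m keep-alive are copied from the curl commands in the original post, and sending the raw scroll ID as the request body is assumed to behave like the curl call there.

```python
import json
import urllib.request
from typing import Callable, Iterator


def http_fetch_page(scroll_id: str,
                    base_url: str = "http://remotehost:9200",
                    keep_alive: str = "10m") -> dict:
    """Hypothetical helper: send one scroll request with the ID as the body."""
    url = f"{base_url}/_search/scroll?scroll={keep_alive}"
    # Supplying data makes urllib issue a POST (assumed equivalent to curl -d here).
    req = urllib.request.Request(url, data=scroll_id.encode("utf-8"))
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def scroll_all(fetch_page: Callable[[str], dict],
               first_scroll_id: str) -> Iterator[dict]:
    """Yield every hit, feeding each response's _scroll_id into the next request."""
    scroll_id = first_scroll_id
    while True:
        page = fetch_page(scroll_id)
        hits = page["hits"]["hits"]
        if not hits:
            # An empty page is taken as the end of the scroll.
            break
        yield from hits
        # Use the ID from the response just received, even if it looks unchanged.
        scroll_id = page.get("_scroll_id", scroll_id)
```

Usage would look like `for hit in scroll_all(http_fetch_page, initial_id): ...`, where initial_id is the _scroll_id returned by the first search_type=scan request.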

