Scroll and scan issue, no new scroll_id generated


(abshkd) #1

Hi,
I am stuck on a weird problem.

I am running a text_phrase query over a range of timestamps. The query
itself works fine. I want to download the entire result set (some 1.5 million
documents), so I used the scan + scroll_id method.

This is my curl command:

curl -XGET 'remotehost:9200/graylog2/messsage/_search?search_type=scan&scroll=10m&size=10' -d '
{
"query":{"bool":{"must":[{"text":{"message":"this is an error"}},{"range":{"created_at":{"from":"1341597072","to":"1344275472"}}}]}}
}'

I get the scroll ID in the response, as expected.

Next step: use the scroll ID on the _search/scroll endpoint:

curl -XGET 'remotehost:9200/_search/scroll?scroll=10m' -d 'c2Nhbjs1OzgwOnRxMnhsNGhuVFZxeE0xUTRyN3piYlE7Nzk6dHEyeGw0aG5UVnF4TTFRNHI3emJiUTs3ODp0cTJ4bDRoblRWcXhNMVE0cjd6YmJROzc3OnRxMnhsNGhuVFZxeE0xUTRyN3piYlE7NzY6dHEyeGw0aG5UVnF4TTFRNHI3emJiUTsxO3RvdGFsX2hpdHM6MTIwNDQ3NTs='

However, the response gives me back the same scroll ID in the JSON
['_scroll_id'] field.

As a result, I can make one more request with that scroll ID, after which
it fails:

{"error":"ArrayIndexOutOfBoundsException[174]","status":500}

I am not sure what I am doing wrong here. Please help.

Thank you


(Shay Banon) #2

Are you using the scroll id from the previous response as the one to use for the next request?
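
To illustrate the pattern that question points at, here is a minimal sketch of a scroll loop that feeds each response's _scroll_id into the next request and stops when a page comes back with no hits. It is not taken from the thread: the function names (http_fetch_page, scroll_all) are hypothetical, the host and the 10m keep-alive are copied from the curl calls above, and it assumes the endpoint accepts the raw scroll ID as the request body, as in those calls.

```python
import json
import urllib.request


def http_fetch_page(scroll_id, host="http://remotehost:9200", keep_alive="10m"):
    """Send the raw scroll id to /_search/scroll and return the parsed JSON.

    Assumption: mirrors the curl call in the thread (scroll id as the body).
    urllib issues a POST when a body is supplied, where the curl call uses GET.
    """
    url = "%s/_search/scroll?scroll=%s" % (host, keep_alive)
    req = urllib.request.Request(url, data=scroll_id.encode("utf-8"))
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def scroll_all(first_scroll_id, fetch_page=http_fetch_page):
    """Yield every hit, following the scroll id returned by each response."""
    scroll_id = first_scroll_id
    while True:
        page = fetch_page(scroll_id)
        hits = page.get("hits", {}).get("hits", [])
        if not hits:
            # An empty page means the scroll is exhausted; stop requesting.
            break
        for hit in hits:
            yield hit
        # Use the id from THIS response for the next request. If the server
        # returns the same id again, that id is simply reused.
        scroll_id = page.get("_scroll_id", scroll_id)
```

Passing fetch_page as a parameter keeps the loop independent of the HTTP client, so it can be exercised against canned responses before pointing it at a real cluster.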

On Aug 7, 2012, at 12:35 AM, abshkd abhishek.dujari@gmail.com wrote:


