Need help with scan/scroll using elasticsearch-py client

I'm using Elasticsearch 5.2.1 and elasticsearch-py 5.2.0. My index.max_result_window is set to the default (10,000).

I'd like to have the option in my script to return all matching documents for a query. When I execute my script, I consistently get a result like this: <generator object scan at 0x00B5CE40> instead of the dict I would expect. I know my query is fine, as I can return the first 10,000 results using es.search without an error.

All that said, I'm a relative noob, so any and all help is appreciated.

My code looks like this (simplified for brevity):

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch('hostname', port=9200)

res = helpers.scan(
    client=es,
    scroll='2m',
    query={
        "query": {
            "bool": {
                "must": [
                    {"query_string": {"query": escaped_query}},
                    {"range": {"@timestamp": {"gte": from_date, "format": "basic_date"}}},
                ]
            }
        }
    },
    index="custom_data*",
)

print(res)

Just updating my own post for anyone else looking for help with this. The "scan" helper returns a Python generator object (I didn't realize that was a thing), not a dict. The generator fetches results lazily through the scroll API, and iterating over it yields every matching hit, one dict at a time. The code should look something like this:

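from elasticsearch import Elasticsearch, helpers

es = Elasticsearch('hostname', port=9200)

# escaped_query and from_date are defined elsewhere in the script.
res = helpers.scan(
    client=es,
    scroll='2m',  # how long each scroll context is kept alive between requests
    query={
        "query": {
            "bool": {
                "must": [
                    {"query_string": {"query": escaped_query}},
                    {"range": {"@timestamp": {"gte": from_date, "format": "basic_date"}}},
                ]
            }
        }
    },
    index="custom_data*",
)

# scan() returns a generator; iterating over it yields one hit dict at a time.
for i in res:
    print(i)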
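
For anyone who wants to see the generator behavior without a running cluster, here is a minimal sketch. Note that fake_scan is a hypothetical stand-in I made up for illustration, not part of elasticsearch-py; it only mimics the way helpers.scan hands back one hit dict at a time, and the "_id" and "_source" values are invented sample data.

```python
from typing import Any, Dict, Iterator, List


def fake_scan(total: int, page_size: int = 3) -> Iterator[Dict[str, Any]]:
    """Hypothetical stand-in for helpers.scan: yields one hit dict at a time."""
    for start in range(0, total, page_size):
        # Each pass of this loop plays the role of one scroll request.
        page = [
            {"_id": str(n), "_source": {"value": n}}
            for n in range(start, min(start + page_size, total))
        ]
        for hit in page:
            yield hit


res = fake_scan(total=7)
print(res)  # shows a generator object; no hits have been produced yet

# Iterating drives the generator; here we keep only each hit's _source.
sources: List[Dict[str, Any]] = [hit["_source"] for hit in res]
print(len(sources))

# A generator can be consumed only once, so a second pass yields nothing.
leftover = list(res)
print(leftover)
```

With the real helper, the same pattern should apply: loop over res, or wrap it in list(res) if you really need everything in memory at once. For large result sets, processing hits inside the loop is probably the safer choice.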
