Elasticsearch Result Size


(Siv) #1

Hi

I'm using the Python client library to talk to Elasticsearch:

es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
res = es.search(index="college", doc_type='people', body=doc)

It returns only 10 records, whereas I have 30K+ matching records. If I pass size explicitly:

es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
res = es.search(index="college", doc_type='people', body=doc, size=30450)

This retrieves all the records. But I want to retrieve every matching record without specifying the size parameter (I might not know it upfront). How can I do that?


(Pemontto) #2

You want to do a scan and scroll search. I've just had to do the same thing in Python; it might look something like this for you:

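from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
# With search_type='scan' the first response carries no hits, only a scroll ID.
# (This search type exists only in older Elasticsearch releases.)
res = es.search(index="college", doc_type='people', body=doc, scroll='60s', search_type='scan')

results = []
scroll_id = res['_scroll_id']

while True:
    # Each scroll call returns the next batch plus the ID to use for the following call.
    res = es.scroll(scroll_id=scroll_id, scroll='60s')
    hits = res['hits']['hits']
    if not hits:
        # An empty batch means every matching document has been read.
        break
    results += hits
    scroll_id = res['_scroll_id']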

You could also look at using the scan helper (elasticsearch.helpers.scan), which wraps the scroll loop for you.
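
For reference, here is a rough sketch of the same scroll loop written as a generator, which is roughly the pattern the helper hides. It is an illustration, not the helper's actual implementation: the function name iter_all_hits and the page_size and keep_alive parameters are made up for this example, and it assumes a client object that exposes search, scroll and clear_scroll with these keyword arguments, as the Python client does. It uses a plain scrolled search rather than search_type='scan', so the first response already contains hits.

```python
def iter_all_hits(es, index, body, page_size=1000, keep_alive="60s"):
    """Yield every hit matching `body`, fetching one scroll page at a time.

    `es` is assumed to be an Elasticsearch client (or anything with the
    same search / scroll / clear_scroll methods). Hypothetical helper.
    """
    # The initial search opens the scroll context and returns the first page.
    res = es.search(index=index, body=body, scroll=keep_alive, size=page_size)
    scroll_id = res.get("_scroll_id")
    try:
        # An empty page means all matching documents have been returned.
        while res["hits"]["hits"]:
            for hit in res["hits"]["hits"]:
                yield hit
            res = es.scroll(scroll_id=scroll_id, scroll=keep_alive)
            # The scroll ID may change between calls, so always keep the latest.
            scroll_id = res.get("_scroll_id", scroll_id)
    finally:
        # Release the server-side scroll context, even if the caller stops early.
        if scroll_id:
            es.clear_scroll(scroll_id=scroll_id)
```

You would call it as `results = list(iter_all_hits(es, "college", doc))`, or loop over it directly to avoid holding 30K+ documents in memory at once. With the real helper the equivalent should be along the lines of `for hit in helpers.scan(es, query=doc, index="college"): ...`, but check the signature against the client version you have installed.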

