Get all documents matching a list of values for a field

jmaharjan · December 1, 2020, 8:19pm

I am using multi search (`msearch` from elasticsearch-py) to look up documents by a list of document ids across a list of indices. The ids are explicitly assigned. The goal is to find which of those ids already exist, update the documents that do, and index a new document for each id that does not. Since there can be more than 10K ids to look up, I split them into chunks and use this piece of code for the search:

    results = []
    chunks = [list_with_ids[x:x + 10000] for x in range(0, len(list_with_ids), 10000)]
    for chunk in chunks:
        if len(chunk) > 0:
            request = []
            for index in list_of_indices:
                req_head = {'index': index}
                req_body = {
                    "size": 10000,
                    "query": {
                        "ids": {
                            "values": chunk
                        }
                    },
                }
                request.extend([req_head, req_body])
                try:
                    result = client.msearch(body=request)
                except:
                    continue
                for response in result['responses']:
                    results.append(response)

The query runs fine most of the time, but sometimes it does not seem to return all the documents that match. That makes it look as if documents with some of the ids do not yet exist in Elasticsearch, so the remaining part of my code (not shown here) creates a new document for each of them, thus creating duplicates. How can we ensure that all matching documents are returned?
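
A minimal sketch of one way the lookup could be restructured, assuming the snippet above is the code as it runs. Three things in it could plausibly hide matches: `msearch` is called inside the loop over indices, so the growing request is re-sent on every iteration; the bare `except: continue` silently skips a failed call; and each entry in `responses` can itself be an error object or a partial result (`timed_out`, failed shards) rather than a full set of hits. The helper names `chunked` and `find_existing_ids` are hypothetical, and `client` is assumed to be an elasticsearch-py client that accepts `msearch(body=...)` as in the post. Whether any of these is the actual cause here is not confirmed.

```python
def chunked(ids, size):
    """Yield consecutive slices of `ids` with at most `size` items each."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def find_existing_ids(client, indices, ids, chunk_size=10000):
    """Return {index: set of ids found in that index}.

    Raises RuntimeError instead of silently skipping a sub-search that
    failed or returned partial results, so a missing id is never
    mistaken for "document does not exist".
    """
    found = {index: set() for index in indices}
    for chunk in chunked(ids, chunk_size):
        body = []
        for index in indices:
            body.append({"index": index})
            body.append({
                "size": len(chunk),      # at most one hit per requested id
                "_source": False,        # only the ids are needed here
                "query": {"ids": {"values": chunk}},
            })
        # One msearch per chunk, sent only after the body is fully built.
        result = client.msearch(body=body)
        # Responses come back in the same order as the searches were sent.
        for index, response in zip(indices, result["responses"]):
            if "error" in response:
                raise RuntimeError(f"search on {index} failed: {response['error']}")
            failed_shards = response.get("_shards", {}).get("failed", 0)
            if response.get("timed_out") or failed_shards:
                raise RuntimeError(f"partial result from {index}")
            for hit in response["hits"]["hits"]:
                found[index].add(hit["_id"])
    return found
```

With the result in hand, the ids still to be created for an index would be `set(list_with_ids) - found[index]`. Raising on errors is a deliberate choice in this sketch: a retry is safer than treating an unanswered search as "not found".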

system · December 29, 2020, 8:19pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
