Get all documents matching a list of values for a field

I am using multi search (msearch from elasticsearch-py) to look up documents by ID across a list of indices. The IDs are explicitly assigned. The goal is to find the documents that already exist (using a list of IDs), update those, and create a new document for any ID that does not exist yet. Since there can be more than 10K IDs to look up, I use this piece of code for the search.

        results = []
        chunks = [list_with_ids[x:x + 10000] for x in range(0, len(list_with_ids), 10000)]
        for chunk in chunks:
            if len(chunk) > 0:
                request = []
                for index in list_of_indices:
                    req_head = {'index': index}
                    req_body = {
                        "size": 10000,
                        "query": {
                            "ids": {
                                "values": chunk
                            }
                        },
                    }

                    request.extend([req_head, req_body])
                    try:
                        result = client.msearch(body=request)
                    except:
                        continue
                    for response in result['responses']:
                        results.append(response)

The query seems to run fine for the most part, but sometimes it does not return all the documents that match. This makes it look as if documents with some IDs do not exist in Elasticsearch yet, so the rest of my code (not shown here) creates new documents for them, which results in duplicate documents. How can we ensure that all matching documents are returned?
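
One thing that may be worth ruling out: in the snippet above, a failed msearch call is skipped silently by the bare except, and an individual entry in result['responses'] can itself be an error object rather than a set of hits. In either case the IDs in that chunk would look as if they were missing. Below is a minimal sketch of a lookup that records such failures separately instead of treating them as "not found". The helper names (chunked, find_existing_ids) and the found/failed split are my own assumptions for illustration, not part of elasticsearch-py; the only client call used is msearch(body=...), as in the code above. It also sends one msearch per chunk, after the requests for all indices have been built.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def find_existing_ids(client, indices, ids, chunk_size=10000):
    """Return (found, failed) as two sets of document IDs.

    found  -- IDs returned as hits by at least one index
    failed -- IDs whose lookup could not be trusted (request or shard failure)

    `client` is assumed to expose msearch(body=...) like the client in the
    question; both helper names here are hypothetical.
    """
    found, failed = set(), set()
    for chunk in chunked(ids, chunk_size):
        # Build one header/body pair per index, then send a single msearch.
        request = []
        for index in indices:
            request.append({"index": index})
            request.append({
                "size": len(chunk),
                "query": {"ids": {"values": chunk}},
            })
        try:
            result = client.msearch(body=request)
        except Exception:
            # A failed request says nothing about whether the documents exist.
            failed.update(chunk)
            continue
        for response in result["responses"]:
            # A sub-search can fail or be partial even if the request succeeded.
            if (
                "error" in response
                or response.get("timed_out")
                or response.get("_shards", {}).get("failed")
            ):
                failed.update(chunk)
                continue
            found.update(hit["_id"] for hit in response["hits"]["hits"])
    return found, failed
```

With something like this, the rest of the code would create a new document only for IDs that are in neither found nor failed, and retry or log the IDs in failed. If the missing documents were indexed shortly before the lookup, it may also be worth checking whether the index had been refreshed by then, since search only sees documents after a refresh.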
