In later versions of Elasticsearch we introduced the composite aggregation and terms aggregation partitioning to help break big requests like this one into smaller pieces.
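As a rough sketch of what partitioning looks like on those later versions, the helper below builds one search body per partition, each covering a slice of the terms. The field name `hash`, the aggregation name `dupes`, and the helper itself are hypothetical; `include.partition` / `include.num_partitions` and `min_doc_count` are the terms aggregation options I'm assuming you'd use, and none of this applies to 2.4.

```python
from typing import Any, Dict, Iterator


def partitioned_terms_bodies(
    field: str, num_partitions: int, size: int = 10000
) -> Iterator[Dict[str, Any]]:
    """Yield one search request body per terms-aggregation partition.

    Each body asks only for the terms that fall into a single partition,
    so no one request has to hold every unique value at once.
    """
    for partition in range(num_partitions):
        yield {
            "size": 0,  # no hits needed, only the aggregation buckets
            "aggs": {
                "dupes": {  # hypothetical aggregation name
                    "terms": {
                        "field": field,
                        "include": {
                            "partition": partition,
                            "num_partitions": num_partitions,
                        },
                        "size": size,
                        # only keep values seen on at least two docs
                        "min_doc_count": 2,
                    }
                }
            },
        }
```

You would send each body as a separate search and collect the buckets. Pick `num_partitions` so that the expected number of unique values per partition stays comfortably below `size`.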
With the 2.4 APIs you could use the scroll API instead: sort the docs by hash, stream them out to your client code, and look for duplicates among adjacent docs in the sequence.
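The client-side part of that could look something like the sketch below. It assumes the hits arrive already sorted by the hash value (for example from a scroll with a sort on that field) and that each hit is a dict with `_id` and `_source`. The field name `hash` and the function name are hypothetical. Because the stream is sorted, duplicates are always adjacent, so only one group of docs is held in memory at a time.

```python
from itertools import groupby
from typing import Any, Dict, Iterable, Iterator, List, Tuple


def find_duplicates(
    hits: Iterable[Dict[str, Any]], hash_field: str = "hash"
) -> Iterator[Tuple[Any, List[str]]]:
    """Yield (hash_value, [doc ids]) for every hash seen on more than one doc.

    `hits` must already be sorted by `hash_field`; groupby only merges
    adjacent items, so an unsorted stream would miss duplicates.
    """
    for value, group in groupby(hits, key=lambda hit: hit["_source"][hash_field]):
        ids = [hit["_id"] for hit in group]
        if len(ids) > 1:
            yield value, ids
```

If you use the Python client, `elasticsearch.helpers.scan` with `preserve_order=True` is one way to get a sorted scroll stream to feed into this, at the cost of a slower scroll than the default unsorted one.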