Reindex API using a query and size

I have an index with 120 million records (each record ~5k), and I want to use the Reindex API to create a new index with a subset of those records.

I've identified a subset of records (about 2 million) with a query, and I'd like to use this query in the Reindex API call.

Is it best practice to move all records in a single command, or do I need to do some type of batching or throttling? I know there's a size parameter in the Reindex API. If I set it to 10,000, will it move all 2 million records to the new index in batches of 10,000, or just the first 10,000 records?

Thanks!

You can certainly do the entire thing with a single request. However, if you want throttling (to lessen the impact of the reindexing), you will have to wait until 2.4.0 (see: https://github.com/elastic/elasticsearch/pull/18020).
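
As a rough sketch of the single-request approach, the Python below only builds the JSON body you would send to POST /_reindex; it does not contact a cluster. The index names (records_v1, records_subset), the term query on a status field, and the helper build_reindex_body are all made up for illustration, so substitute your own query.

```python
import json


def build_reindex_body(source_index, dest_index, query, size=None):
    """Build a request body for POST /_reindex (hypothetical helper)."""
    body = {
        # Only documents matching the query are read from the source index.
        "source": {"index": source_index, "query": query},
        "dest": {"index": dest_index},
    }
    if size is not None:
        # Top-level "size" limits how many documents are copied in total.
        body["size"] = size
    return body


# Placeholder query; replace with the query that selects your 2 million records.
query = {"term": {"status": "active"}}
body = build_reindex_body("records_v1", "records_subset", query)
print(json.dumps(body, indent=2))
```

Leaving size out lets the one request copy every document the query matches.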

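You could copy a subset with size, but size caps the total number of documents copied; it is not a batch size, so setting it to 10,000 would copy only 10,000 documents. There also isn't an offset/from option, so you wouldn't be able to ask for a different "batch" of documents in a follow-up request; you'd have to limit each request artificially using the query somehow.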
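
One way to limit requests through the query, sketched here under assumptions: if your documents have a numeric or date field you can range over, you could wrap the original query in a bool query with a range filter and issue one reindex request per range. The field name record_id, the bounds, the index names, and both helper functions are hypothetical; the code only builds request bodies and sends nothing.

```python
def range_slices(start, stop, step):
    """Yield half-open [lo, hi) bounds that cover start..stop."""
    lo = start
    while lo < stop:
        hi = min(lo + step, stop)
        yield lo, hi
        lo = hi


def sliced_reindex_bodies(source_index, dest_index, query, field, start, stop, step):
    """Build one POST /_reindex body per range slice (hypothetical helper)."""
    bodies = []
    for lo, hi in range_slices(start, stop, step):
        sliced_query = {
            "bool": {
                # Keep the original selection ...
                "must": [query],
                # ... and narrow it to one slice of the assumed field.
                "filter": [{"range": {field: {"gte": lo, "lt": hi}}}],
            }
        }
        bodies.append({
            "source": {"index": source_index, "query": sliced_query},
            "dest": {"index": dest_index},
        })
    return bodies


bodies = sliced_reindex_bodies(
    "records_v1", "records_subset",
    {"term": {"status": "active"}},  # placeholder for your real query
    "record_id", 0, 2500, 1000,
)
for b in bodies:
    print(b["source"]["query"]["bool"]["filter"][0]["range"])
```

Because the slices are half-open and do not overlap, each matching document falls into exactly one request, and you can pause between requests as a crude form of throttling.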

Thanks!