Hey, so basically I'm using Elasticsearch to retrieve a lot of data quickly so I can run mapped calculations on it server-side. I'm planning for millions to tens of millions of documents having to be loaded at once.
I came across the scan helper in the Python client, so I run the scan per shard (via sliced scroll) and split the slices into separate processes.
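For context, here's roughly what that looks like — a sketch using `helpers.scan` from the `elasticsearch` Python client with sliced scroll, one slice per process. The index name, host, and slice count are placeholders, not my real setup:

```python
from multiprocessing import Pool

INDEX = "my-index"   # assumption: placeholder index name
SLICES = 4           # assumption: roughly one slice per shard


def sliced_query(slice_id: int, max_slices: int) -> dict:
    """Build a match_all query restricted to one scroll slice."""
    return {
        "slice": {"id": slice_id, "max": max_slices},
        "query": {"match_all": {}},
    }


def scan_slice(slice_id: int) -> int:
    """Scan one slice of the index and return how many docs were seen."""
    # Imported here so the pure query-building logic above works standalone.
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch("http://localhost:9200")  # assumption: local cluster
    count = 0
    for hit in helpers.scan(es, index=INDEX,
                            query=sliced_query(slice_id, SLICES)):
        count += 1  # mapped calculation would go here
    return count


if __name__ == "__main__":
    with Pool(SLICES) as pool:
        totals = pool.map(scan_slice, range(SLICES))
    print(sum(totals))
```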
However, I'd still like to put a threshold on this in case the index ever reaches hundreds of millions of documents; past that point I'd rather just pull a sample of the data.
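In the meantime I've been capping it client-side: since `helpers.scan` returns a generator, `itertools.islice` can stop the scroll after N hits (the cap value below is just an example, and note this truncates rather than samples randomly — for a true random sample a `function_score` query with `random_score` would be one option):

```python
from itertools import islice

MAX_DOCS = 1_000_000  # assumption: example cap, not a real limit


def take(hits, limit):
    """Yield at most `limit` items from any iterator of hits."""
    return islice(hits, limit)


# With a real scan generator it would look like:
#   for hit in take(helpers.scan(es, index="my-index"), MAX_DOCS):
#       ...
# Demo with a stand-in iterator:
sample = list(take(iter(range(10)), 3))
print(sample)  # -> [0, 1, 2]
```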
Please let me know if this feature exists, as I can't seem to find any documentation on it. So far Elasticsearch is exactly what I need; I'm just missing this one small feature.