I use the sliced scroll API to fetch results from Elasticsearch in parallel (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html#sliced-scroll). But I can't get Elasticsearch to return only the first million results. The result count can be reduced with the "min_score" parameter, but that's not what I want. Is it possible to limit the number of results when the sliced scroll method is used?
terminate_after (bottom of this page: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html) might work with scrolling, I'm not sure
But why not just record the number of hits that you've scrolled client-side and stop when you get to 1m?
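A rough sketch of that client-side counting, in case it helps. It assumes you already have something that yields one list of hits per scroll page; `limit_hits` and `pages` are made-up names for illustration, not part of any Elasticsearch client.

```python
from itertools import islice


def limit_hits(pages, max_hits):
    """Yield hits from an iterable of scroll pages, stopping after max_hits.

    `pages` is assumed to yield one list of hits per scroll request.
    Pages are pulled lazily, so no further page is requested once the
    limit has been reached.
    """
    hits = (hit for page in pages for hit in page)
    return islice(hits, max_hits)


# Stand-in for real scroll pages: three pages holding 3, 3 and 2 hits.
pages = iter([[1, 2, 3], [4, 5, 6], [7, 8]])
first_five = list(limit_hits(pages, 5))
print(first_five)
```

Once the limit is hit you would probably also want to clear the scroll context instead of waiting for it to time out.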
Thanks, I will try "terminate_after".
Yes, I tried this method. But with sliced scroll, results seem to arrive in no particular order across slices. So if I stop after receiving 1m results, there is no guarantee that they are the 1m highest-scoring results.
Example: I'm fetching results using 5 slices with size 1000. I assumed that slice 0 would receive the 1000 highest-scoring results, but that is not guaranteed. I tested it.
OK, I solved this problem. I track the number of results fetched by each slice and stop fetching once a defined count per slice is reached.
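For anyone landing here later, the per-slice cap could be sketched roughly like this. The helper names (`slice_bodies`, `per_slice_limit`, `merge_top`) are hypothetical, and the sketch assumes each slice returns its own hits in descending score order. Note that an even split only gives the true global top N if the high scores happen to be spread evenly across slices; to be certain, each slice would have to fetch up to N hits before merging.

```python
import heapq


def slice_bodies(query, num_slices, page_size):
    """Build one search body per slice for a sliced scroll."""
    return [
        {"slice": {"id": i, "max": num_slices}, "size": page_size, "query": query}
        for i in range(num_slices)
    ]


def per_slice_limit(total, num_slices):
    """Even split of the overall limit across slices, rounded up."""
    return -(-total // num_slices)


def merge_top(slice_hits, total):
    """Merge the hits collected per slice and keep the `total` best by _score."""
    all_hits = (hit for hits in slice_hits for hit in hits)
    return heapq.nlargest(total, all_hits, key=lambda hit: hit["_score"])


bodies = slice_bodies({"match_all": {}}, num_slices=5, page_size=1000)
limit = per_slice_limit(1_000_000, 5)

# Tiny stand-in for what two slices might have returned.
collected = [
    [{"_id": "a", "_score": 3.0}, {"_id": "b", "_score": 1.0}],
    [{"_id": "c", "_score": 2.0}],
]
top_two = merge_top(collected, 2)
```

Each body would be sent as its own scroll request, and each worker would stop scrolling once it has collected `limit` hits.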
I don't know exactly how to use the "terminate_after" param. It stops fetching when the defined count is reached, but the results received are not the highest-scoring ones. So it's useless for me.