With the Java Client.prepareSearchScroll() API, we can query an Elasticsearch index using scrolls, as described in the documentation. With this API, we can fetch only a specific number of hits per request by setting SearchRequestBuilder.setSize(). The SearchResponse provides the scroll ID, which is then passed in the subsequent request.
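To make the pattern concrete, here is a minimal sketch of the scroll loop I have in mind. It runs against a hypothetical in-memory FakeIndex rather than a real cluster, so FakeIndex, Page and drain are illustrative names and not Elasticsearch classes. The comments mark which step plays the role of setSize() and which plays the role of prepareSearchScroll(). Real scroll IDs are opaque strings; encoding an offset in them here is only a simplification.

```java
import java.util.ArrayList;
import java.util.List;

public class ScrollSketch {

    /** One batch of hits plus the ID needed to request the next batch. */
    static final class Page {
        final String scrollId;
        final List<String> hits;

        Page(String scrollId, List<String> hits) {
            this.scrollId = scrollId;
            this.hits = hits;
        }
    }

    /** Hypothetical stand-in for an index; NOT part of any Elasticsearch library. */
    static final class FakeIndex {
        private final List<String> docs;
        private final int size; // plays the role of SearchRequestBuilder.setSize()

        FakeIndex(List<String> docs, int size) {
            this.docs = docs;
            this.size = size;
        }

        /** Plays the role of the initial search request that opens the scroll. */
        Page search() {
            return pageFrom(0);
        }

        /** Plays the role of Client.prepareSearchScroll(scrollId). */
        Page scroll(String scrollId) {
            // Simplification: the ID is just the next offset. Real IDs are opaque.
            return pageFrom(Integer.parseInt(scrollId));
        }

        private Page pageFrom(int offset) {
            int start = Math.min(offset, docs.size());
            int end = Math.min(offset + size, docs.size());
            return new Page(String.valueOf(end), new ArrayList<>(docs.subList(start, end)));
        }
    }

    /** Reads the index batch by batch and records the size of each non-empty batch. */
    static List<Integer> drain(FakeIndex index) {
        List<Integer> batchSizes = new ArrayList<>();
        Page page = index.search();
        while (!page.hits.isEmpty()) {          // an empty batch ends the scroll
            batchSizes.add(page.hits.size());
            page = index.scroll(page.scrollId); // reuse the ID from the last response
        }
        return batchSizes;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 7; i++) {
            docs.add("doc-" + i);
        }
        System.out.println(drain(new FakeIndex(docs, 3))); // prints [3, 3, 1]
    }
}
```

The point is that the caller controls both the batch size and when the next batch is fetched, which is the behaviour I would like to reproduce.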
How can one use elasticsearch-spark to implement similar functionality? All JavaEsSpark.esRDD() methods return a JavaPairRDD, which would contain all hits. Is there a way to request only a specific number of hits per request and then continue scrolling with further requests?
I found the configuration option es.scroll.size, which seems equivalent to SearchRequestBuilder.setSize(), but I am not sure how to use it, or how the scroll IDs would be used in the context of elasticsearch-spark.
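For reference, this is roughly how I assume the setting would be supplied: as one entry in the connector's settings map, next to the other es.* options. The class name, the node address, the resource name and the value 100 are placeholders of mine, and the sketch only builds the map; it does not show how the scroll IDs are handled, which is the part I am asking about.

```java
import java.util.HashMap;
import java.util.Map;

public class EsSparkScrollConfig {

    /** Builds the connector settings; address and resource are placeholder values. */
    static Map<String, String> settings(int scrollSize) {
        Map<String, String> cfg = new HashMap<>();
        cfg.put("es.nodes", "localhost");              // assumed cluster address
        cfg.put("es.resource", "my-index/my-type");    // hypothetical index/type
        cfg.put("es.scroll.size", String.valueOf(scrollSize)); // the setting in question
        return cfg;
    }

    public static void main(String[] args) {
        Map<String, String> cfg = settings(100);
        System.out.println(cfg.get("es.scroll.size"));
        // With Spark and elasticsearch-spark on the classpath, my assumption is that
        // this map would then be passed to one of the JavaEsSpark.esRDD() overloads.
    }
}
```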