The Java class SearchSourceBuilder has a method 'from', which takes an int. So if I'm paginating I will set this value. However, the SearchResponse has a method .getHits().getTotalHits() which returns a long.
When dealing with result sets larger than Integer.MAX_VALUE, how should I request a page whose starting element is greater than Integer.MAX_VALUE?
The index I am designing is for transactions. So it's conceivable that I want to get all transactions for a user, where that user has more than 10000 transactions, and a front end client wants to paginate through that data. So it's not about relevance. Should there be another way of me paginating through this kind of data?
I had a look at the 'search-after' functionality but it's not ideal to have to ask the front end to supply the last transaction from the previous page.
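One possible way to soften this, sketched under assumptions: the server could wrap the sort values of the last hit into an opaque token, so the front end only echoes a string back instead of supplying a transaction. The class name `PageCursor` and the choice of a timestamp plus a transaction id as sort keys are hypothetical and not part of Elasticsearch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the Elasticsearch client.
// It packs the sort values of the last hit on a page into an opaque token.
public final class PageCursor {
    public final long timestampMillis;  // assumed primary sort key
    public final String transactionId;  // assumed unique tiebreaker

    public PageCursor(long timestampMillis, String transactionId) {
        this.timestampMillis = timestampMillis;
        this.transactionId = transactionId;
    }

    // Encode as a URL-safe Base64 string the client can pass back unchanged.
    public String encode() {
        String raw = timestampMillis + ":" + transactionId;
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    // Decode a token produced by encode(); splits on the first ':' only,
    // so the transaction id may itself contain ':'.
    public static PageCursor decode(String token) {
        String raw = new String(Base64.getUrlDecoder().decode(token), StandardCharsets.UTF_8);
        int sep = raw.indexOf(':');
        if (sep < 0) {
            throw new IllegalArgumentException("Malformed cursor");
        }
        return new PageCursor(Long.parseLong(raw.substring(0, sep)), raw.substring(sep + 1));
    }

    // Values in the same order as the assumed sort clauses.
    public Object[] toSearchAfterValues() {
        return new Object[] { timestampMillis, transactionId };
    }
}
```

The decoded values would then be handed to the search_after part of the next request, in the same order as the sort clauses. The client treats the token as a "next page" handle and never needs to know what is inside it.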
Does increasing index.max_result_window affect the performance of all search queries?
If you are not using it (like with size=10, from=0), no.
But the problem is not only performance. My concern is more about memory usage and the risk of running into out-of-memory errors.
What happens if max_result_window is 10000 and page 1 of a search containing more than 10000 total results is requested?
Yes.
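To make the limit concrete: the window check applies to from + size, not to the total number of hits, so the first page is served no matter how many documents match. Below is a minimal sketch of that arithmetic, assuming zero-based page indexes. The class and method names are hypothetical; 10000 is the default value of index.max_result_window.

```java
// Hypothetical helper illustrating the from + size check; not an Elasticsearch class.
public final class ResultWindow {
    // Default value of the index.max_result_window setting.
    public static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    private ResultWindow() {
    }

    // Offset of the first hit on a zero-based page, computed as a long
    // so that large page indexes do not silently overflow an int.
    public static long fromOffset(long pageIndex, int pageSize) {
        return Math.multiplyExact(pageIndex, (long) pageSize);
    }

    // True when from + size stays within the configured window.
    public static boolean fitsWindow(long pageIndex, int pageSize, int maxResultWindow) {
        long end = Math.addExact(fromOffset(pageIndex, pageSize), (long) pageSize);
        return end <= maxResultWindow;
    }
}
```

With a page size of 10, zero-based pages 0 through 999 fit the default window, while page 1000 (from=10000) does not and would need search_after or a narrower query instead.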
I imagine it would be rare for our transaction searches to contain more than 10000, but we would need to support these use cases.
I believe you have something like a date of transaction in your dataset.
Just sort by date ascending or descending and then you don't need to do deep pagination...
Add some faceted navigation and you're done!
I really think that users should never have to go to the last page.
But maybe your users are asking themselves other questions that you can solve differently with Elasticsearch. Aggregations are the key, IMHO.