SearchSourceBuilder.from accepts an integer

kieran · November 3, 2017, 11:20am

Hi

The Java class SearchSourceBuilder has a method 'from', which takes an int. So if I'm paginating I will set this value. However, the SearchResponse has a method .getHits().getTotalHits() which returns a long.

When dealing with result sets larger than Integer.MAX_VALUE, how should I request a page whose starting element is greater than Integer.MAX_VALUE?

Thanks
Kieran

dadoonet · November 3, 2017, 11:42am

You can't do that.

from+size is limited by default to 10000.
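
A minimal sketch of what that limit means for paging code: the offset can be computed as a long and checked against the window before a request is built, so the int parameter of 'from' is never the real constraint. PageWindow and offsetFor are hypothetical names, not part of the Elasticsearch client, and the 10000 is hard-coded here as an assumption rather than read from the index settings.

```java
// Hypothetical helper, not part of the Elasticsearch client.
final class PageWindow {
    // Assumption: the index uses the default limit of 10000 for from + size.
    static final long MAX_RESULT_WINDOW = 10_000L;

    /** Returns the 'from' offset for a zero-based page, or throws if from + size exceeds the window. */
    static int offsetFor(long page, int size) {
        if (page < 0 || size <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and size must be > 0");
        }
        // Long arithmetic; multiplyExact throws ArithmeticException on overflow.
        long from = Math.multiplyExact(page, (long) size);
        // Written as a subtraction so the check itself cannot overflow.
        if (from > MAX_RESULT_WINDOW - size) {
            throw new IllegalArgumentException(
                "from=" + from + ", size=" + size + " exceeds the window of " + MAX_RESULT_WINDOW);
        }
        return (int) from; // safe: from is at most 10000 at this point
    }

    public static void main(String[] args) {
        System.out.println(offsetFor(0, 10));   // prints 0 (first page)
        System.out.println(offsetFor(999, 10)); // prints 9990 (last page that fits: 9990 + 10 = 10000)
    }
}
```

The returned int could then be passed to SearchSourceBuilder.from. A page of 1000 with size 10 is rejected by this helper before any request is sent.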

kieran · November 3, 2017, 11:44am

How would I paginate on a result set that is greater than 10000?

dadoonet · November 3, 2017, 11:46am

Are you sure a user would want to get back the least relevant results?

If you want to extract the entire result set, you can look at the scroll API: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-request-scroll.html
If you want to do deep pagination, you can look at search after: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-request-search-after.html

But you can't jump directly to, say, page 134512.
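
To illustrate the idea behind search after, here is an in-memory sketch, not Elasticsearch client code: each page is defined by the sort values of the last hit on the previous page instead of by an offset. The Tx class, its fields, and pageAfter are hypothetical names, and the sort order (timestamp descending, then id ascending as a tiebreaker) is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SearchAfterSketch {
    // Hypothetical transaction: a timestamp plus a unique id used as tiebreaker.
    static final class Tx {
        final long timestamp;
        final String id;
        Tx(long timestamp, String id) { this.timestamp = timestamp; this.id = id; }
    }

    // Assumed sort: newest first, then id ascending so the order is total.
    static final Comparator<Tx> ORDER =
        Comparator.comparingLong((Tx t) -> t.timestamp).reversed()
                  .thenComparing((Tx t) -> t.id);

    /** Returns up to 'size' items that sort strictly after 'last' (the first page if 'last' is null). */
    static List<Tx> pageAfter(List<Tx> all, Tx last, int size) {
        List<Tx> sorted = new ArrayList<>(all);
        sorted.sort(ORDER);
        List<Tx> page = new ArrayList<>();
        for (Tx t : sorted) {
            if (last != null && ORDER.compare(t, last) <= 0) {
                continue; // at or before the cursor: already served on an earlier page
            }
            page.add(t);
            if (page.size() == size) {
                break;
            }
        }
        return page;
    }

    public static void main(String[] args) {
        List<Tx> all = List.of(new Tx(3, "c"), new Tx(1, "a"), new Tx(3, "b"), new Tx(2, "d"));
        List<Tx> first = pageAfter(all, null, 2);
        List<Tx> second = pageAfter(all, first.get(first.size() - 1), 2);
        for (Tx t : second) {
            System.out.println(t.timestamp + " " + t.id);
        }
    }
}
```

Because every page starts from the previous page's last hit, pages can only be walked in order, which is why a jump to an arbitrary page number is not possible with this approach.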

kieran · November 3, 2017, 1:59pm

The index I am designing is for transactions. So it's conceivable that I want to get all transactions for a user who has more than 10000 of them, and that a front end client wants to paginate through that data. So it's not about relevance. Is there another way for me to paginate through this kind of data?

I had a look at the 'search-after' functionality but it's not ideal to have to ask the front end to supply the last transaction from the previous page.
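
One way to soften this, sketched under assumptions: the server can pack the last hit's sort values into an opaque token, so the front end only echoes back a string and never handles a transaction. PageCursor and its methods are hypothetical names, and the sort is assumed to be a long timestamp plus a string id. This does not remove the round trip, it only hides what the token contains.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the Elasticsearch client.
final class PageCursor {
    /** Packs the last hit's sort values (assumed: timestamp and id) into a URL-safe token. */
    static String encode(long timestamp, String id) {
        String raw = timestamp + ":" + id;
        return Base64.getUrlEncoder().withoutPadding()
                     .encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    /** Unpacks a token into { Long timestamp, String id }. */
    static Object[] decode(String token) {
        String raw = new String(Base64.getUrlDecoder().decode(token), StandardCharsets.UTF_8);
        int sep = raw.indexOf(':'); // first colon: the timestamp never contains one
        if (sep < 0) {
            throw new IllegalArgumentException("malformed cursor");
        }
        return new Object[] { Long.parseLong(raw.substring(0, sep)), raw.substring(sep + 1) };
    }

    public static void main(String[] args) {
        String token = encode(1509710400000L, "tx-42");
        Object[] sortValues = decode(token);
        System.out.println(token);
        System.out.println(sortValues[0] + " " + sortValues[1]);
    }
}
```

The decoded values would be what the server passes as the search after values on the next request.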

dadoonet · November 3, 2017, 2:15pm

You can increase the default limit of 10000 hits but you really need to understand the consequences.

Look at index.max_result_window in https://www.elastic.co/guide/en/elasticsearch/reference/5.6/index-modules.html#dynamic-index-settings

kieran · November 3, 2017, 2:22pm

I see. I imagine it would be rare for our transaction searches to contain more than 10000, but we would need to support these use cases.

Does increasing index.max_result_window affect the performance of all search queries?

What happens if max_result_window is 10000 and page 1 of a search containing more than 10000 total results is requested? Would an error be received?

dadoonet · November 3, 2017, 4:02pm

> Does increasing index.max_result_window affect the performance of all search queries?

If you are not using it (like with size=10, from=0), no.

But the problem is not only performance. My concern is more about memory usage and the risk of running into out-of-memory errors.
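
A rough sketch of where the memory goes: for a from/size request, each shard has to collect its own top from + size hits, and the coordinating node then merges the candidates from every shard. The class and method names are hypothetical, and the shard count of 5 is only an assumption for the example.

```java
// Hypothetical back-of-the-envelope helper, not part of the Elasticsearch client.
final class DeepPagingCost {
    /** Upper bound on the hits the coordinating node must collect and sort for one request. */
    static long hitsToMerge(int shards, long from, int size) {
        // Each shard contributes up to from + size candidates; exact math guards against overflow.
        return Math.multiplyExact((long) shards, Math.addExact(from, (long) size));
    }

    public static void main(String[] args) {
        System.out.println(hitsToMerge(5, 0, 10));         // prints 50
        System.out.println(hitsToMerge(5, 1_000_000, 10)); // prints 5000050
    }
}
```

The page returned to the user has 10 hits in both cases, but the second request makes the cluster hold about five million candidates to produce it.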

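> What happens if max_result_window is 10000 and page 1 of a search containing more than 10000 total results is requested? Would an error be received?

Yes, an error is returned as soon as from + size exceeds max_result_window. Page 1 itself (from=0) is unaffected, however many total hits the search matches.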

> I imagine it would be rare for our transaction searches to contain more than 10000, but we would need to support these use cases.

I believe you have something like a date of transaction in your dataset.
Just sort by date ascending or descending and then you don't need to do deep pagination...
Add some faceted navigation and you're done!

I really think that users should never have to go to the last page.

But maybe your users are really asking other questions that you can answer differently with Elasticsearch. Aggregations are the key, IMHO.
