Max Window Size is Set to 10000 but the Terms aggregations is giving single filter value larger than max window size.

santhosh.linga · December 8, 2023, 4:33pm

We have created an index and are querying the index to display the complete dataset. However, we have encountered performance issues, as we are dealing with 1 million records in response to user search queries. To address this, we are making changes to show only the top 10,000 records and are exploring the possibility of applying filters and sorting exclusively to these 10,000 records if necessary.

For example:

The current scenario is as follows: When we search for "doctor" in the title field, we retrieve 800,000 records. Due to the impact on our servers, we are heeding Elasticsearch's recommendation to limit the results to the top 10,000.

My questions are:

Can we disregard the remaining records whenever a search matches more than 10,000?
Can we retrieve filter options for only the top 10,000 records and apply filters to those records alone?
If sorting is required, can we limit it to the top 10,000 records only?

yago82 · December 9, 2023, 3:50pm

Hi,

Yes, you can limit your search results to the top 10,000 records. This is actually the default limit in Elasticsearch for a single query: the index.max_result_window setting caps from + size at 10,000. If you need to retrieve more than 10,000 results, you would typically use the Scroll API or the search_after parameter, but in your case, it sounds like you want to avoid this due to performance concerns.

Here's an example of how you can limit your search results:

GET /your_index/_search
{
  "query": {
    "match": {
      "title": "doctor"
    }
  },
  "size": 10000
}

This will return the top 10,000 matches for "doctor" in the title field.

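As for your questions about filtering and sorting: yes, you can combine filters and sorting with this limit. Note that both are applied at query time to every document that matches the query, and size then caps how many hits are returned. The response therefore contains the first 10,000 hits in the requested sort order, not a re-sort of the 10,000 most relevant ones.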

Here's an example of how you can apply a filter and sort your results:

GET /your_index/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "title": "doctor"
        }
      },
      "filter": {
        "term": {
          "some_field": "some_value"
        }
      }
    }
  },
  "sort": [
    { "another_field": "asc" }
  ],
  "size": 10000
}

This will return up to 10,000 documents that match "doctor" in the title field and have "some_value" in "some_field", sorted by "another_field" in ascending order.

dadoonet · December 10, 2023, 6:45am

Welcome!

I'm curious. Why would an end user need to see 10000 documents instead of just looking at the first ones?

santhosh.linga · December 11, 2023, 4:38pm

Thank you @yago82 and @dadoonet for your reply.

@yago82 we don't want to show all 10k results at once; we have a pagination feature.

So I assume we cannot use a size of 10k in this case? That is why we rely on the max window size of 10k instead, although the default is already 10k, so setting it makes no difference. Our main concerns are:

a. Ignore all the results above 10k
b. Get the filters, or apply sorting, only for the first 10k

@yago82 and @dadoonet Please let me know if my use case is unclear.

santhosh.linga · December 19, 2023, 6:46pm

We provide information covering the whole USA. If a user wants to see only VA information, we have filters, but by default we show data for the entire USA. That is why we see such large result sets.

dadoonet · December 19, 2023, 7:20pm

And what is a user going to do with 20000 documents, for example? Are they consuming those documents in another tool? Or are all the documents displayed on the result page? Are they going to paginate over the result set?

santhosh.linga · December 19, 2023, 7:35pm

We have the pagination feature. We don't display more than 500 records per page. The default page size is 10.
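
To make the paging arithmetic concrete, here is a minimal Python sketch of how a 1-based page number could be mapped onto from and size while staying inside the 10,000-hit window. It assumes the default index.max_result_window of 10,000 and the 500-per-page cap mentioned above. The helper names page_params and build_search_body are hypothetical, and the match query on title is copied from the earlier example.

```python
MAX_RESULT_WINDOW = 10_000  # assumed: Elasticsearch default for index.max_result_window
MAX_PAGE_SIZE = 500         # per-page cap mentioned in this thread


def page_params(page, page_size=10):
    """Translate a 1-based page number into `from`/`size`.

    Returns None when the page starts at or beyond the 10,000-hit window.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    size = min(page_size, MAX_PAGE_SIZE)
    start = (page - 1) * size
    if start >= MAX_RESULT_WINDOW:
        return None
    # Shrink the last page so that from + size never exceeds the window.
    size = min(size, MAX_RESULT_WINDOW - start)
    return {"from": start, "size": size}


def build_search_body(text, page, page_size=10):
    """Build a search request body for one page (hypothetical helper)."""
    params = page_params(page, page_size)
    if params is None:
        return None
    return {"query": {"match": {"title": text}}, **params}
```

With the default page size of 10, page 1000 is the last page this sketch allows; the UI could simply stop offering a next button once page_params returns None.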

dadoonet · December 19, 2023, 9:04pm

So how does a user navigate to page 9990? By chance?
Or do they click the next button 9990 times?

santhosh.linga · December 19, 2023, 9:27pm

I am sorry to say this, but yes, we use pagination to navigate.

With that in place, we are trying to retrieve only the top 10k results and then:

Get the filters for the top 10k
Perform sorting, if required, on only the top 10k

Can we achieve this?

dadoonet · December 19, 2023, 10:00pm

I guess you can sort on the client side.
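
A rough Python sketch of what client-side handling could look like, assuming the hits (at most 10,000) have already been fetched and are available as the hits.hits list of the search response. The helper names sort_hits and facet_counts are made up for illustration. The second helper counts filter values over the same fetched hits, which is one way to get filter options that cover only those documents.

```python
from collections import Counter


def sort_hits(hits, field, descending=False):
    """Sort already-fetched hits by a `_source` field; hits lacking it go last."""
    present = [h for h in hits if h["_source"].get(field) is not None]
    missing = [h for h in hits if h["_source"].get(field) is None]
    present.sort(key=lambda h: h["_source"][field], reverse=descending)
    return present + missing


def facet_counts(hits, field):
    """Count how often each value of `field` occurs among the fetched hits."""
    counts = Counter()
    for hit in hits:
        value = hit["_source"].get(field)
        if value is None:
            continue
        # A field may hold a single value or a list of values.
        values = value if isinstance(value, list) else [value]
        counts.update(values)
    # Most frequent values first, as (value, count) pairs.
    return counts.most_common()
```

The trade-off is that every search has to transfer up to 10,000 documents to the client before anything can be sorted or counted, so this is only a sketch of the idea, not a recommendation for large documents.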

santhosh.linga · December 19, 2023, 10:15pm

Thank you. How can I achieve the filters?

dadoonet · December 19, 2023, 10:30pm

I guess as mentioned here: Max Window Size is Set to 10000 but the Terms aggregations is giving single filter value larger than max window size. - #2 by yago82

If it does not work for you, please provide a reproduction script from which we can iterate.
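
One option that may be worth testing in such a reproduction is the sampler aggregation, which restricts its sub-aggregations to the top-scoring documents on each shard. The Python sketch below only builds a request body as a dict. The function name, the aggregation names, and the facet field are hypothetical, and it has not been run against the poster's index. Note that shard_size applies per shard, so with several primary shards the combined sample can exceed 10,000 documents.

```python
def build_sampled_facets_body(text, facet_field, shard_size=10_000, page_size=10):
    """Build a search body whose terms aggregation only sees a top-scoring sample.

    `facet_field` is assumed to be a keyword field (hypothetical name).
    """
    return {
        "query": {"match": {"title": text}},
        "size": page_size,
        # Stop counting total hits accurately beyond 10,000.
        "track_total_hits": 10_000,
        "aggs": {
            "top_scoring_sample": {
                # Keep at most `shard_size` top-scoring documents per shard.
                "sampler": {"shard_size": shard_size},
                "aggs": {
                    "facet_values": {
                        "terms": {"field": facet_field, "size": 50}
                    }
                },
            }
        },
    }
```

Because the sample is chosen by relevance score, this fits a search sorted by score; with a custom sort, the documents in the sample would not necessarily be the ones shown on the first pages.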

system · January 16, 2024, 10:30pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
