What are the differences between search_after and a range filter?

Are there any differences in performance or traversal logic between using search_after and using a range filter?

For example, this query:

GET twitter/_search
{
    "size": 10,
    "query": {
        "range" : {
            "my_id" : {
                "gt" : 1234
            }
        }
    },
    "sort": [
        {"my_id": "asc"}
    ]
}

vs this query:

GET twitter/_search
{
    "size": 10,
    "search_after": [1234],
    "sort": [
        {"my_id": "asc"}
    ]
}

Are there any differences, or will they perform equally?

Best regards


Those two are completely different things and cannot be compared directly.

A range query allows you to filter for documents where a field value is inside of a certain range. It allows you to reduce the total number of documents being returned.

The search_after functionality allows you to do deep pagination with far fewer resources than specifying from/size parameters, as the search_after parameter is basically a pointer to where the search should start for the documents that are about to be returned. Consider it state that is offloaded to the client, if you want. It will, however, not change the result set, just where the search starts.

See https://www.elastic.co/guide/en/elasticsearch/reference/6.3/search-request-search-after.html
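The "state offloaded to the client" idea can be sketched in Python as a loop that keeps the cursor on the client side. This is only a rough sketch: `iter_pages` and the `send` callable are hypothetical names, and `send` stands in for whatever function posts a body to `_search` and returns the parsed JSON response. The one thing taken from Elasticsearch itself is that each hit in a sorted search carries a `sort` array, which is passed back unchanged as `search_after`.

```python
from typing import Any, Callable, Dict, Iterator, List

Body = Dict[str, Any]
Hit = Dict[str, Any]


def iter_pages(send: Callable[[Body], Dict[str, Any]], base_body: Body) -> Iterator[List[Hit]]:
    """Yield pages of hits, carrying the cursor on the client side.

    `send` is a hypothetical callable that submits a search body and
    returns the parsed response; nothing is stored on the server.
    """
    body = dict(base_body)  # first request has no search_after
    while True:
        hits = send(body)["hits"]["hits"]
        if not hits:
            break  # an empty page means the traversal is finished
        yield hits
        # The sort values of the last hit become the starting point of
        # the next request; the query and sort stay exactly the same.
        body = dict(base_body, search_after=hits[-1]["sort"])
```

The only thing that changes between requests is the `search_after` value, so the client can stop and resume at any point by remembering the last `sort` array.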

@spinscale Thank you.
I have read the documentation.
The thing is that search_after uses the sort values of the last returned hit. So it basically looks like search_after is just a slightly more "convenient" way to use a range filter with gt.
I.e. logically, both search_after and range with gt avoid the deep pagination problem.
The only differences I see are:

  1. search_after looks better than query -> range -> gt.
  2. If you are using range -> gt, you have to take the sort field values of the last returned hit and build the query for the next page yourself. If you are using search_after, the sort values of the last returned hit come back in its sort array and can be passed to the search_after parameter as they are.
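To make the two options in the list above concrete, here is a small Python sketch that builds the next-page request body both ways from the last hit of the previous page. The function names are made up for illustration, and the sketch assumes the single-field `my_id` ascending sort from the original question. It only shows how the requests are constructed and says nothing about how Elasticsearch executes them.

```python
from typing import Any, Dict

Hit = Dict[str, Any]
Body = Dict[str, Any]

SORT = [{"my_id": "asc"}]  # same sort as in the question


def next_page_with_range(last_hit: Hit, size: int = 10) -> Body:
    """Option 1: turn the last sort value into a range filter by hand."""
    last_id = last_hit["sort"][0]  # single sort field assumed
    return {
        "size": size,
        "query": {"range": {"my_id": {"gt": last_id}}},
        "sort": SORT,
    }


def next_page_with_search_after(last_hit: Hit, size: int = 10) -> Body:
    """Option 2: pass the last hit's sort array through unchanged."""
    return {
        "size": size,
        "search_after": last_hit["sort"],
        "sort": SORT,
    }
```

With a last hit whose `sort` array is `[1234]`, these produce the two request bodies shown in the original question.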

That is why I am comparing those two methods (range filter and search_after parameter for pagination), and that is why I asked this question. Is there any backend (logical or algorithmic) difference in pagination between search_after and range?

I think that a lot of people were using scroll for pagination and did not know how to paginate without it. Possibly the Elasticsearch team was repeatedly asked "How do I paginate without storing a context?" and decided to add the search_after parameter to hide the range usage, because it is not very obvious that range can be used for deep pagination. But again, that is just my assumption. I did not find any information that explains why search_after was created or how it differs from a range query.

As I understand it, resource usage and performance will be equal for both search_after and range when we use them for pagination.

