Unable to paginate a PIT search without a sort clause

Hello,

I am trying to paginate a search, but I can't because the hits don't include the sort values to pass to search_after. However, from what I understand of the search_after documentation (Paginate search results | Elasticsearch Guide [7.12] | Elastic), the tiebreaker sort field should have been added implicitly:

All PIT search requests add an implicit sort tiebreaker field called _shard_doc, which can also be provided explicitly.

I ran the following steps on a one-node 7.12 cluster. Am I doing something wrong, or did I misunderstand that part of the documentation?

I create an index with 2 documents:

POST /post/_doc
{
   "author": "alice dupont",
   "title": "Let's go to Mars"
}
POST /post/_doc
{
   "author": "bob benkhen",
   "title": "Let's go to the Moon"
}

I create a PIT on the index:

POST /post/_pit?keep_alive=2m

If I make a search using the PIT, I get a hit without a sort field:

GET _search
{
   "pit": {
      "id": "PIT"
   },
   "size": 1
}
{
    "_shards": {
        "failed": 0,
        "skipped": 0,
        "successful": 1,
        "total": 1
    },
    "hits": {
        "hits": [
            {
                "_id": "2Ff1F3kBfAqWnfewA6Um",
                "_index": "post",
                "_score": 1.0,
                "_source": {
                    "author": "alice dupont",
                    "title": "Let's go to Mars"
                },
                "_type": "_doc"
            }
        ],
        "max_score": 1.0,
        "total": {
            "relation": "eq",
            "value": 2
        }
    },
    "pit_id": "PIT",
    "timed_out": false,
    "took": 1
}

However, if I add the tie breaker sort field explicitly, I do have the sort field as part of the hits. My understanding was that this sort field would be added by Elasticsearch automatically.

GET _search
{
   "pit": {
      "id": "PIT"
   },
   "size": 1,
   "sort": {
      "_shard_doc": "asc"
   }
}
{
    "_shards": {
        "failed": 0,
        "skipped": 0,
        "successful": 1,
        "total": 1
    },
    "hits": {
        "hits": [
            {
                "_id": "2Ff1F3kBfAqWnfewA6Um",
                "_index": "post",
                "_score": null,
                "_source": {
                    "author": "alice dupont",
                    "title": "Let's go to Mars"
                },
                "_type": "_doc",
                "sort": [
                    0
                ]
            }
        ],
        "max_score": null,
        "total": {
            "relation": "eq",
            "value": 2
        }
    },
    "pit_id": "PIT",
    "timed_out": false,
    "took": 1
}

My understanding was that this sort field would be added by Elasticsearch automatically.

It is added automatically to PIT searches that already have a sort clause. For example, if you sorted by some field <my_field>, each hit's "sort" array would contain a _shard_doc value in addition to the <my_field> value.

But since your initial query didn't sort by any field, you need to provide _shard_doc in the sort explicitly.
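
To make that rule concrete, here is a minimal Python sketch of a helper that adds the tiebreaker only when the request body has no sort clause. The name ensure_shard_doc_sort is hypothetical and not part of any Elasticsearch client; it only manipulates the request body as a plain dict.

```python
import copy


def ensure_shard_doc_sort(body):
    """Return a copy of a PIT search body whose hits will carry "sort" values.

    Hypothetical helper for illustration, not an Elasticsearch client API.
    """
    result = copy.deepcopy(body)
    if not result.get("sort"):
        # No sort clause: hits come back without "sort" values,
        # so ask for the tiebreaker explicitly.
        result["sort"] = [{"_shard_doc": "asc"}]
    # If a sort clause is already present on a PIT search, _shard_doc is
    # appended implicitly, so the body is left as it is.
    return result


unsorted = {"pit": {"id": "PIT"}, "size": 1}
print(ensure_shard_doc_sort(unsorted)["sort"])  # [{'_shard_doc': 'asc'}]
```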

Hi Mayya,

Just to better understand: what is the difference between the two queries below? I think the first search would return hits in index order, while the second would return hits grouped by shard. Is _shard_doc not always added in order to avoid this unnecessary grouping?

GET _search
{
   "pit": {
      "id": "PIT"
   }
}
GET _search
{
   "pit": {
      "id": "PIT"
   },
   "sort": {
      "_shard_doc": "asc"
   }
}

Thanks

There is no difference between these queries if you just want to retrieve the top results: they will be returned in exactly the same order. _shard_doc consists of 1) a shard ID and 2) an internal doc ID, and this is also how a coordinating node sorts hits by default (if there are no scoring queries or other sorting criteria).

But if you want to retrieve subsequent pages, you need to provide a unique search_after parameter, and that's what using _shard_doc will give you – a unique tie-breaker to get consistent pagination.
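
As a rough illustration of that pagination loop, here is a minimal Python sketch. It assumes a callable named search (hypothetical) that takes a request body as a dict, sends it to GET _search, and returns the parsed JSON response as a dict; the function name iter_pit_hits and its parameters are likewise made up for this example.

```python
def iter_pit_hits(search, pit_id, page_size=1, keep_alive="2m"):
    """Yield every hit of a PIT search, page by page, using search_after.

    `search` is an assumed callable: request body (dict) in,
    parsed `_search` response (dict) out.
    """
    body = {
        "pit": {"id": pit_id, "keep_alive": keep_alive},
        "size": page_size,
        # Explicit tiebreaker so that every hit carries "sort" values.
        "sort": [{"_shard_doc": "asc"}],
    }
    while True:
        response = search(body)
        hits = response["hits"]["hits"]
        if not hits:
            return  # an empty page means there is nothing left
        yield from hits
        # The PIT id can change between requests; keep the latest one.
        body["pit"]["id"] = response.get("pit_id", body["pit"]["id"])
        # Resume right after the sort values of the last hit on this page.
        body["search_after"] = hits[-1]["sort"]
```

Because the sort values of the last hit are unique within the PIT, each request picks up exactly where the previous page ended, with no skipped or repeated documents. The loop stops on the first empty page, so it always issues one request more than the number of non-empty pages.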
