Sliced scroll returning more hits than normal search (without slice)

In short, normal search hits < scroll slice 1 hits + scroll slice 2 hits

When I add up the hits from the sliced scroll requests, the sum is greater than the total number of documents returned by a single search.

Normal search query

GET index*/_search
{
  "track_total_hits": true,
  "sort": [
    {
      "@timestamp": {
        "order": "asc",
        "unmapped_type": "boolean"
      }
    }
  ],
  "_source": false,
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "range": {
            "@timestamp": {
              "format": "strict_date_optional_time",
              "gte": "2022-07-31T18:30:00.000Z",
              "lte": "2022-08-30T18:30:00.000Z"
            }
          }
        }
.....

Normal Search Response:

{
  "took" : 1055,
  "timed_out" : false,
  "_shards" : {
    "total" : 455,
    "successful" : 455,
    "skipped" : 290,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 435743,
      "relation" : "eq"
    },
    "max_score" : null,

Slice 1

GET index*/_search
{
  "slice": {
    "id": 0,
    "max": 2
  },
  "track_total_hits": true,
  "sort": [
    {
      "@timestamp": {
        "order": "asc",
        "unmapped_type": "boolean"
      }
    }
  ],
  "_source": false,
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "range": {
            "@timestamp": {
              "format": "strict_date_optional_time",
              "gte": "2022-07-31T18:30:00.000Z",
              "lte": "2022-08-30T18:30:00.000Z"
            }
          }
        }
.....

Slice 1 Response:

  "_shards" : {
    "total" : 455,
    "successful" : 455,
    "skipped" : 290,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 213954,
      "relation" : "eq"
    },
    "max_score" : null,

Slice 2

GET index*/_search
{
  "slice": {
    "id": 1,
    "max": 2
  },
  "track_total_hits": true,
  "sort": [
    {
      "@timestamp": {
        "order": "asc",
        "unmapped_type": "boolean"
      }
    }
  ],
  "_source": false,
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "range": {
            "@timestamp": {
              "format": "strict_date_optional_time",
              "gte": "2022-07-31T18:30:00.000Z",
              "lte": "2022-08-30T18:30:00.000Z"
            }
          }
        }
.....

Slice 2 Response:

  "_shards" : {
    "total" : 455,
    "successful" : 455,
    "skipped" : 292,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 221884,
      "relation" : "eq"
    },
    "max_score" : null,

So as we can see,

total hits for slices = 213954 + 221884 = 435838, which is greater than 435743 (the hits for the normal search).
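For anyone who wants to double-check the arithmetic, here is a minimal sketch. The variable names are only illustrative; the numbers are copied from the responses above.

```python
# Totals copied from the "hits.total.value" fields in the responses above.
normal_total = 435743
slice_totals = [213954, 221884]  # slice id 0 and slice id 1

sliced_sum = sum(slice_totals)
print(sliced_sum)                 # 435838
print(sliced_sum > normal_total)  # True
```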

Can someone explain why it is behaving like this?

FYI, data is not being inserted/deleted. I am querying multiple indexes (index1, index2 ...) in this example.

Version: 7.16.3

Can anyone help? This could be a bug.

It could. Or maybe your index is not "stable" and new documents were indexed after the first search?

No, that is not the case. I can run it again and the count for the normal search is the same, 435743, as when I first ran it a few days ago.

The counts for the 2 slices also remain the same as when I first ran them, i.e. 213954 + 221884 = 435838, but that differs from the normal search total of 435743.

Different slice counts give different totals. For example, with 3 slices the count is 146602 + 155805 + 143114 = 445521.
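To put the two cases side by side, here is a small sketch that computes how many extra hits each slice count reports compared with the normal search. The helper name `slice_surplus` is made up for this post; the numbers are the ones quoted in this thread.

```python
def slice_surplus(normal_total, slice_totals):
    """Return how many more hits the slices report than the normal search."""
    return sum(slice_totals) - normal_total

normal_total = 435743

surplus_2 = slice_surplus(normal_total, [213954, 221884])
surplus_3 = slice_surplus(normal_total, [146602, 155805, 143114])

print(surplus_2)  # 95
print(surplus_3)  # 9778
```

So the surplus is not constant: it changes with the number of slices.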

Is this due to the value used for sort?

FYI, this search runs against many indexes, as you can see from shards.total. For a single index, slicing behaves fine.
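One way to narrow this down might be to collect the (_index, _id) pair of every hit per slice and check whether any document shows up in more than one slice. The sketch below only covers the comparison step and uses made-up sample data; `find_overlap` is a hypothetical helper, not part of any Elasticsearch client, and duplicated documents are only one possible explanation for the surplus.

```python
from collections import Counter

def find_overlap(slices):
    """Return the (_index, _id) pairs that appear in more than one slice.

    `slices` is a list with one entry per slice; each entry is an
    iterable of (_index, _id) tuples collected from that slice's hits.
    """
    counts = Counter()
    for hits in slices:
        # De-duplicate within a slice so only cross-slice repeats count.
        counts.update(set(hits))
    return {key for key, seen in counts.items() if seen > 1}

# Made-up sample data, standing in for IDs gathered from two slices.
slice_0 = [("index1", "a"), ("index1", "b"), ("index2", "a")]
slice_1 = [("index1", "c"), ("index2", "a")]

overlap = find_overlap([slice_0, slice_1])
print(overlap)  # {('index2', 'a')}
```

An empty result would suggest the extra hits are not the same documents being returned by several slices, so the cause would have to be elsewhere.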
