How to calculate the number of slices for a search slicing point-in-time query?

peter8 · June 2, 2025, 2:31pm

I have an index with approx. 1.5M documents and 1 primary shard. A count request for my query shows that I can expect 31357 documents.

When I run sliced point-in-time queries with 9 (or fewer) slices and a maximum of 10000 documents per slice, I end up with missing documents (29086 in total). I noticed that slice 3 has 10000 documents and the other slices have fewer than 10000 documents.

With 10 (or more) slices, the sliced point-in-time queries return all 31357 documents, and each individual slice has fewer than 10000 documents.

Is there a formula I can use to calculate the best number of slices so that I get all documents?

Tortoise · June 4, 2025, 12:21pm

Hello @peter8,

Welcome to the community!!

I tried the following approach:

GET kibana_sample_data_logs/_count
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2025-06-03T00:00:00Z",
        "lte": "2025-06-04T00:00:00Z"
      }
    }
  }
}

Output: "count": 232

POST /kibana_sample_data_logs/_search
{
  "size": 0,
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2025-06-03T00:00:00Z",
        "lte": "2025-06-04T00:00:00Z"
      }
    }
  },
  "aggs": {
    "terms_count_per_day": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "1d"
      }
    }
  }
}

Output snippet:

  "hits": {
    "total": {
      "value": 232,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "terms_count_per_day": {
      "buckets": [
        {
          "key_as_string": "2025-06-03T00:00:00.000Z",
          "key": 1748908800000,
          "doc_count": 230
        },
        {
          "key_as_string": "2025-06-04T00:00:00.000Z",
          "key": 1748995200000,
          "doc_count": 2
        }
      ]
    }

Since you want all records and do not want to guess the number of slices, one option might be to split the query into fixed time ranges (1 day, 1 week, or 1 month, depending on the available data) and use the counts per range as shown above. I am not sure whether this approach fits your case.

If I have misunderstood your requirement, please share more details.

Thanks!!

peter8 · June 5, 2025, 7:05am

A count request shows that I can expect 31357 documents for a query.
My assumption is that this will fit in 4 slices of 10000 documents.

So I start with a request for a PIT:

POST /myindex/_pit?keep_alive=5m

which returns a PIT ID, e.g.: w8abBAERb3BlbnJkdy...nN3AAA=

Then I run 4 sliced point-in-time queries, with slice.id increasing from 0 to 3 and slice.max = 4:

POST /_search
{
  "slice": { "id": "0", "max": 4 },
  "pit": {
    "id":"w8abBAERb3BlbnJkdy...nN3AAA="
  },
  "size":10000,
  "query": { ...  }
}

POST /_search
{
  "slice": { "id": "1", "max": 4 },
  "pit": {
    "id":"w8abBAERb3BlbnJkdy...nN3AAA="
  },
  "size":10000,
  "query": { ...  }
}

POST /_search
{
  "slice": { "id": "2", "max": 4 },
  "pit": {
    "id":"w8abBAERb3BlbnJkdy...nN3AAA="
  },
  "size":10000,
  "query": { ...  }
}

POST /_search
{
  "slice": { "id": "3", "max": 4 },
  "pit": {
    "id":"w8abBAERb3BlbnJkdy...nN3AAA="
  },
  "size":10000,
  "query": { ...  }
}

slice.id=0 returns 8145 rows
slice.id=1 returns 10000 rows
slice.id=2 returns 201 rows
slice.id=3 returns 5574 rows
Total: 23920 rows, but I expected 31357 rows.
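
The per-slice numbers above already show where the documents went missing. A minimal Python sketch (variable names are illustrative; the numbers are copied from this post) that flags every slice that reached the `size` limit and so may have been cut off:

```python
import math

SIZE = 10_000      # "size" used in each slice request
EXPECTED = 31_357  # result of the count request

# Hits returned per slice.id, as reported above for slice.max = 4
returned = {0: 8145, 1: 10_000, 2: 201, 3: 5574}

# A slice that returns exactly SIZE hits may hold more matches than it returned.
truncated = [slice_id for slice_id, hits in returned.items() if hits >= SIZE]

total = sum(returned.values())
missing = EXPECTED - total

# ceil(EXPECTED / SIZE) is only a lower bound: it would be enough
# only if the matching documents were spread evenly over the slices.
lower_bound = math.ceil(EXPECTED / SIZE)

print(truncated, total, missing, lower_bound)  # [1] 23920 7437 4
```

Only slice 1 reached the limit, so if the count and the PIT see the same data, all 7437 missing documents would belong to that slice. That suggests the matches are spread very unevenly across slices.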

When I increase slice.max and query all slices, more documents are returned.
When slice.max = 10 or higher, all rows are returned (and each slice has fewer than 10000 documents).

The slice.max value must be known before the first sliced point-in-time query.
My question is: how do I calculate the best number of slices so that I get all documents?
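
One possible way to avoid depending on an exact slice count is to page through each slice instead of fetching it in a single request. The sketch below assumes that `search_after` with a `_shard_doc` sort can be combined with `pit` and `slice` in your Elasticsearch version, which is worth checking against the documentation. The function name `iter_slice` and the `search` callable are hypothetical: `search` stands for whatever sends the request body to `POST /_search` and returns the parsed JSON response.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional


def iter_slice(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],
    pit_id: str,
    query: Dict[str, Any],
    slice_id: int,
    slice_max: int,
    page_size: int = 1000,
) -> Iterator[Dict[str, Any]]:
    """Yield every hit of one slice, page by page."""
    search_after: Optional[List[Any]] = None
    while True:
        body: Dict[str, Any] = {
            "slice": {"id": slice_id, "max": slice_max},
            "pit": {"id": pit_id},
            "size": page_size,
            "query": query,
            # Assumption: _shard_doc is available as a sort key with a PIT.
            "sort": [{"_shard_doc": "asc"}],
        }
        if search_after is not None:
            body["search_after"] = search_after
        hits = search(body)["hits"]["hits"]
        if not hits:
            return  # an empty page means the slice is exhausted
        yield from hits
        # Continue after the sort values of the last hit on this page.
        search_after = hits[-1]["sort"]
```

With this pattern a slice holding more than 10000 matches is read in several pages, so the number of slices only affects parallelism and not completeness.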

peter8 · June 10, 2025, 12:36pm

My guess is that the number of slices depends mainly on the number of documents (d) in the index (not the number of documents matching the query) and must be a multiple of the number of shards (s) in the index. Both can be obtained from a count request.
In that case the formula is: slices = s * (roundUp(d / 10_000) / s)
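
A minimal Python sketch of this guess, with one assumption about how to read it: the division by s is rounded up, so that the result is always a multiple of the number of shards. The function name and parameters are illustrative, and the formula itself is only the guess stated above, not a confirmed rule.

```python
import math


def guessed_slice_count(total_docs: int, shards: int, max_per_slice: int = 10_000) -> int:
    """Slice count from the guess above, rounded up to a multiple of `shards`."""
    # Slices needed if every slice covered at most max_per_slice index documents.
    by_size = math.ceil(total_docs / max_per_slice)
    # Assumption: round up to the next multiple of the shard count.
    return shards * math.ceil(by_size / shards)


print(guessed_slice_count(1_500_000, 1))  # 150
print(guessed_slice_count(1_500_000, 4))  # 152
```

For the index in this thread (approx. 1.5M documents, 1 shard) this gives 150 slices, well above the 10 slices that were observed to be enough for this particular query.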
