A count request shows that I can expect 31357 documents for a query.
My assumption is that these will fit in 4 slices of 10000 documents each.
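The arithmetic behind that assumption can be sketched as follows. The helper name `min_slices` is made up for illustration, and the result is only a lower bound: it is enough only if the documents happen to be spread evenly across the slices.

```python
import math

def min_slices(total_docs: int, page_size: int) -> int:
    """Smallest slice count that could hold total_docs if every
    slice received at most page_size documents (even split assumed)."""
    return math.ceil(total_docs / page_size)

# Values taken from the count request and the "size" used in the searches.
print(min_slices(31357, 10000))
```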
So I start with a request for a PIT (point in time):
POST /myindex/_pit?keep_alive=5m
which returns a PIT id, e.g.: w8abBAERb3BlbnJkdy...nN3AAA=
Then I run 4 sliced point-in-time queries against that PIT, with slice.id increasing from 0 to 3 and slice.max = 4:
POST /_search
{
"slice": { "id": "0", "max": 4 },
"pit": {
"id":"w8abBAERb3BlbnJkdy...nN3AAA="
},
"size":10000,
"query": { ... }
}
POST /_search
{
"slice": { "id": "1", "max": 4 },
"pit": {
"id":"w8abBAERb3BlbnJkdy...nN3AAA="
},
"size":10000,
"query": { ... }
}
POST /_search
{
"slice": { "id": "2", "max": 4 },
"pit": {
"id":"w8abBAERb3BlbnJkdy...nN3AAA="
},
"size":10000,
"query": { ... }
}
POST /_search
{
"slice": { "id": "3", "max": 4 },
"pit": {
"id":"w8abBAERb3BlbnJkdy...nN3AAA="
},
"size":10000,
"query": { ... }
}
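The four request bodies above differ only in slice.id, so they can be generated rather than written out by hand. This is a minimal sketch: `build_slice_bodies` is a hypothetical helper, the `match_all` query is a placeholder for the elided query, and the PIT id is the truncated example value from above.

```python
import json

def build_slice_bodies(pit_id: str, max_slices: int, size: int, query: dict) -> list:
    """Return one search body per slice, with slice.id running from 0 to max_slices - 1."""
    return [
        {
            "slice": {"id": slice_id, "max": max_slices},
            "pit": {"id": pit_id},
            "size": size,
            "query": query,
        }
        for slice_id in range(max_slices)
    ]

# Placeholder values: the real query is elided in the post, and the PIT id is truncated.
bodies = build_slice_bodies(
    pit_id="w8abBAERb3BlbnJkdy...nN3AAA=",
    max_slices=4,
    size=10000,
    query={"match_all": {}},
)

for body in bodies:
    # Each body would be sent as its own POST /_search request.
    print(json.dumps(body))
```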
slice.id=0 returns 8145 rows
slice.id=1 returns 10000 rows
slice.id=2 returns 201 rows
slice.id=3 returns 5574 rows
Total: 23920 rows, but I expected 31357 rows.
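The shortfall can be checked with a few lines. The sketch below also flags any slice whose row count equals the requested size; treating such a slice as possibly cut off by the size limit is an assumption here, not something the responses confirm.

```python
# Row counts per slice.id, as reported above.
rows_per_slice = {0: 8145, 1: 10000, 2: 201, 3: 5574}
expected_total = 31357   # from the count request
page_size = 10000        # the "size" used in every search

returned_total = sum(rows_per_slice.values())
missing = expected_total - returned_total

# Assumption: a slice that returns exactly page_size rows may hold more documents.
possibly_capped = [sid for sid, rows in rows_per_slice.items() if rows >= page_size]

print("returned:", returned_total)
print("missing:", missing)
print("slices that hit the size limit:", possibly_capped)
```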
When I increase slice.max and query all slices, more documents are returned.
With slice.max = 10 or higher, all rows are returned (and each slice has fewer than 10000 documents).
However, slice.max must be known before the first sliced point-in-time query.
My question is: how do I calculate the best number of slices so that I get all documents?