Elasticsearch: empty slices when using the scroll API with slice

I have a simple setup: a green single-node cluster (v7.5.2) with one index (8 shards, 0 replicas) holding 61,500 documents.

If I create 8 slices with these POST requests:

http://localhost:9202/products_dev/_search?scroll=10m: {'slice': {'field': 'created_at', 'id': 0, 'max': 8}, 'size': 1000}
http://localhost:9202/products_dev/_search?scroll=10m: {'slice': {'field': 'created_at', 'id': 1, 'max': 8}, 'size': 1000}
...
http://localhost:9202/products_dev/_search?scroll=10m: {'slice': {'field': 'created_at', 'id': 7, 'max': 8}, 'size': 1000}

and collect the first page of hits from each slice, I get these lengths:
[1000, 0, 0, 0, 0, 0, 0, 0]

Only one slice has results, which is equivalent to not slicing the index in the first place.

I then tried a max of 32 with these requests:

http://localhost:9202/products_dev/_search?scroll=10m: {'slice': {'field': 'created_at', 'id': 0, 'max': 32}, 'size': 1000}
http://localhost:9202/products_dev/_search?scroll=10m: {'slice': {'field': 'created_at', 'id': 1, 'max': 32}, 'size': 1000}
...
http://localhost:9202/products_dev/_search?scroll=10m: {'slice': {'field': 'created_at', 'id': 31, 'max': 32}, 'size': 1000}

Collecting the first page of hits from each slice, I get these lengths:
[1000, 0, 0, 0, 0, 0, 0, 0, 1000, 0, 0, 0, 0, 0, 0, 0, 1000, 0, 0, 0, 0, 0, 0, 0, 1000, 0, 0, 0, 0, 0, 0, 0]

This way I can scroll my index using 4 different (non-empty) slices, but I had to create 32 contexts, which is not ideal.

What am I doing wrong?

I tried to use a date as slice.field

"slice": {
        "field": "created_at",

but it did not help.

My local index had so few documents that most of them were saved in a single shard.

@PierreC
There are two issues.

  1. How to use scroll if all documents are in a single shard.
  2. Why all documents went to a single shard.

For #1, add the preference parameter to the URL to limit the search to a single shard:

http://localhost:9202/products_dev/_search?scroll=1m&preference=_shards:0

Combined with slice, this lets you query without empty slices, because the search then targets only one shard and the slices are split within that shard. Instead of 32 slices you will be able to use 4.
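As a rough sketch of the suggestion above, the following Python builds the request URL and bodies without sending them. The host, index name, slice field and page size come from the question; the helper names `non_empty_slices` and `sliced_scroll_requests` are made up for illustration. The first helper assumes that slice `i` is served by shard `i % number_of_shards` when there are at least as many slices as shards, which matches the hit counts reported in the question but is an assumption, not something confirmed in this thread.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://localhost:9202"  # host and port taken from the question
INDEX = "products_dev"              # index name taken from the question


def non_empty_slices(max_slices, num_shards, populated_shard):
    """Slice ids expected to return hits when only one shard holds documents.

    Assumption: with max_slices >= num_shards, slice i is served by
    shard i % num_shards.
    """
    return [i for i in range(max_slices) if i % num_shards == populated_shard]


def sliced_scroll_requests(max_slices, shard=0, scroll="1m", size=1000):
    """Build (url, body) pairs for a sliced scroll restricted to one shard."""
    # Keep ':' unescaped so the URL reads like the one in the answer.
    query = urlencode(
        {"scroll": scroll, "preference": f"_shards:{shard}"}, safe=":"
    )
    url = f"{BASE_URL}/{INDEX}/_search?{query}"
    return [
        (url, {"slice": {"field": "created_at", "id": i, "max": max_slices},
               "size": size})
        for i in range(max_slices)
    ]


if __name__ == "__main__":
    # Slices that would hit shard 0 for max=8 and max=32 on an 8-shard index.
    print(non_empty_slices(8, 8, 0))
    print(non_empty_slices(32, 8, 0))
    # Four slices, all pinned to shard 0; each body would be sent as a POST.
    for url, body in sliced_scroll_requests(4):
        print("POST", url, json.dumps(body))
```

Each pair can then be sent with any HTTP client; only the slice id changes between the four requests.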

For #2, the number of documents is not the reason. This can happen if you pass an _id for each document during indexing and all the _ids happen to hash to the same shard, which is very uncommon. The other possibility is that routing was used during indexing with an identical routing value for all documents.
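To illustrate the routing point, here is a simplified sketch of shard selection as `hash(routing) % number_of_shards`, where the routing value defaults to the document _id. `zlib.crc32` is only a stand-in for Elasticsearch's own Murmur3-based hash, so the shard numbers it produces will not match a real cluster. The ids (`doc-0`, `doc-1`, ...) and the routing value `tenant-a` are hypothetical.

```python
import zlib

NUM_SHARDS = 8  # number of primary shards in the question


def shard_for(routing_value, num_shards=NUM_SHARDS):
    """Pick a shard as hash(routing_value) % num_shards.

    zlib.crc32 is a stand-in hash for illustration only; Elasticsearch
    uses a different hash function internally.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


# Default case: the routing value is the document _id, so distinct ids
# are spread over several shards.
by_id = {shard_for(f"doc-{i}") for i in range(1000)}

# Custom routing with the same value for every document: the hash input
# never changes, so every document lands on the same shard.
by_routing = {shard_for("tenant-a") for _ in range(1000)}

if __name__ == "__main__":
    print("shards used with distinct _ids:", sorted(by_id))
    print("shards used with one routing value:", sorted(by_routing))
```

Under this model, a single populated shard points to a constant routing value rather than to a small document count.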

If routing was used, you can add the routing parameter to the URL instead of preference.
