Elasticsearch: empty slices when using scroll api with slice

PierreC · July 2, 2020, 7:55am

I have a simple setup: a green single-node cluster (v7.5.2) with 1 index (8 shards, 0 replicas) containing 61,500 documents.

I create 8 slices with these POST requests (URL, then request body):

http://localhost:9202/products_dev/_search?scroll=10m: {'slice': {'field': 'created_at', 'id': 0, 'max': 8}, 'size': 1000}
http://localhost:9202/products_dev/_search?scroll=10m: {'slice': {'field': 'created_at', 'id': 1, 'max': 8}, 'size': 1000}
...
http://localhost:9202/products_dev/_search?scroll=10m: {'slice': {'field': 'created_at', 'id': 7, 'max': 8}, 'size': 1000}

For each slice I collect the first page of hits and get these lengths:
[1000, 0, 0, 0, 0, 0, 0, 0]

Only 1 slice has results, which is equivalent to not slicing the index in the first place.

I tried a max of 32 with these requests:

http://localhost:9202/products_dev/_search?scroll=10m: {'slice': {'field': 'created_at', 'id': 0, 'max': 32}, 'size': 1000}
http://localhost:9202/products_dev/_search?scroll=10m: {'slice': {'field': 'created_at', 'id': 1, 'max': 32}, 'size': 1000}
...
http://localhost:9202/products_dev/_search?scroll=10m: {'slice': {'field': 'created_at', 'id': 31, 'max': 32}, 'size': 1000}

For each slice I collect the first page of hits and get these lengths:
[1000, 0, 0, 0, 0, 0, 0, 0, 1000, 0, 0, 0, 0, 0, 0, 0, 1000, 0, 0, 0, 0, 0, 0, 0, 1000, 0, 0, 0, 0, 0, 0, 0]

This way I can scroll my index using 4 different (non-empty) slices, but I had to create 32 scroll contexts, which is not ideal.

What am I doing wrong?

I tried to use a date as slice.field

"slice": {
        "field": "created_at",

but it did not help.

PierreC · July 9, 2020, 12:02pm

My local index had so few documents that most of them ended up in 1 shard.
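
Documents sitting in a single shard would be consistent with the lengths reported above. The sketch below is a simplified model of how slices might be assigned to shards, not Elasticsearch's actual code: the `id % num_shards` rule and both function names are assumptions chosen to match the observed output.

```python
def slice_shards(slice_id, max_slices, num_shards):
    """Return the set of shards a slice reads from, under the ASSUMED rule."""
    if max_slices >= num_shards:
        # Assumption: each slice is pinned to exactly one shard.
        return {slice_id % num_shards}
    # Assumption: with fewer slices than shards, a slice covers whole shards.
    return {s for s in range(num_shards) if s % max_slices == slice_id}


def non_empty_slices(max_slices, num_shards, shards_with_docs):
    """List the slice ids that can return hits, given which shards hold documents."""
    return [
        slice_id
        for slice_id in range(max_slices)
        if slice_shards(slice_id, max_slices, num_shards) & shards_with_docs
    ]


# 8 shards, all documents on shard 0 (the situation described in this thread).
print(non_empty_slices(8, 8, {0}))
print(non_empty_slices(32, 8, {0}))
```

Under this model, 8 slices leave only slice 0 with hits, and 32 slices leave slices 0, 8, 16 and 24, which is the pattern seen in the question.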

Vinayak_Sapre · July 10, 2020, 5:00am

@PierreC
There are two issues:

1. How to use scroll if all documents are in a single shard.
2. Why all documents went to a single shard.

For #1, add the preference parameter to the URL to limit the search to a single shard:

http://localhost:9202/products_dev/_search?scroll=1m&preference=_shards:0

Combined with slice, this lets you query without empty slices: instead of 32 slices you will be able to use 4.
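
A minimal sketch of how those 4 requests could be put together, assuming the index URL and the created_at slice field from the question. The function name is hypothetical, and the code only builds the URL and JSON bodies; it does not send anything.

```python
import json
from urllib.parse import urlencode

# Index URL taken from the question; adjust for your own cluster.
BASE_URL = "http://localhost:9202/products_dev/_search"


def sliced_scroll_requests(num_slices, shard=0, scroll="1m", size=1000):
    """Build (url, json_body) pairs for a sliced scroll limited to one shard."""
    # safe=":" keeps the colon in "_shards:0" unescaped for readability.
    query = urlencode({"scroll": scroll, "preference": f"_shards:{shard}"}, safe=":")
    url = f"{BASE_URL}?{query}"
    requests = []
    for slice_id in range(num_slices):
        body = {
            "slice": {"field": "created_at", "id": slice_id, "max": num_slices},
            "size": size,
        }
        requests.append((url, json.dumps(body)))
    return requests


for url, body in sliced_scroll_requests(4):
    print("POST", url, body)
```

Each pair would be sent as a POST with a JSON content type, and every slice then continues with its own scroll id.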

For #2, the number of documents is not the reason. This can happen if you pass an _id for each document during indexing and all the _ids hash to the same shard, which is very uncommon. The other possibility is that routing was used during indexing with an identical routing value for all docs.

If routing was used, you can add the routing parameter to the URL instead of preference.
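
To check how documents are actually spread across shards, the per-shard counts from the _cat/shards API can be compared. A minimal sketch, assuming the response was requested with `format=json` and `h=shard,prirep,docs` and has already been parsed into a list of dicts. The helper name is hypothetical, and the sample rows are made up for illustration, not taken from this cluster.

```python
def docs_per_primary_shard(rows):
    """Map shard number to document count, using primary shards only."""
    counts = {}
    for row in rows:
        if row.get("prirep") != "p":
            continue  # skip replica copies
        docs = row.get("docs")
        # The JSON output reports numbers as strings; unassigned shards may have no count.
        counts[int(row["shard"])] = int(docs) if docs is not None else 0
    return counts


# Made-up sample rows, for illustration only.
sample_rows = [
    {"shard": "0", "prirep": "p", "docs": "900"},
    {"shard": "1", "prirep": "p", "docs": "100"},
    {"shard": "1", "prirep": "r", "docs": "100"},
]
print(docs_per_primary_shard(sample_rows))
```

A count concentrated in one shard would point to the _id or routing causes described above.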

system · August 7, 2020, 5:00am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
