How does sliced scan work in the background?

arkash20 · June 3, 2021, 8:23am

Hi,

I used the scan method to retrieve data from a huge index we have (200M records), using multiprocessing. For that I used a sliced scan and set it to 6 slices, matching the number of my shards.
Now here come the theoretical questions that I don't understand and couldn't find a clear answer to by googling.
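
For readers following along, a setup like the one described above might look roughly like the sketch below. It is not the poster's actual code: the index name, cluster URL, and function names are hypothetical, and it assumes the `elasticsearch` Python client with its `helpers.scan` helper. Each worker process scrolls one slice by sending a search body that carries a `slice` section with an `id` and a `max`.

```python
from multiprocessing import Pool

INDEX = "my-index"                # hypothetical index name
ES_URL = "http://localhost:9200"  # hypothetical cluster address


def slice_body(slice_id, max_slices, query=None):
    """Build the search body for one slice of a sliced scroll."""
    if max_slices < 2:
        raise ValueError("max_slices must be at least 2")
    if not 0 <= slice_id < max_slices:
        raise ValueError("slice_id must be in [0, max_slices)")
    return {
        "slice": {"id": slice_id, "max": max_slices},
        "query": query if query is not None else {"match_all": {}},
    }


def scan_slice(args):
    """Worker: scroll through one slice and count its hits."""
    slice_id, max_slices = args
    # Imported here so the module loads without the client installed.
    from elasticsearch import Elasticsearch, helpers

    client = Elasticsearch(ES_URL)
    body = slice_body(slice_id, max_slices)
    return sum(1 for _ in helpers.scan(client, index=INDEX, query=body))


def run_all(max_slices=6):
    """Run one worker process per slice and return per-slice hit counts."""
    with Pool(processes=max_slices) as pool:
        return pool.map(scan_slice, [(i, max_slices) for i in range(max_slices)])
```

Calling `run_all(6)` against a reachable cluster would start six processes, one per slice. Counting hits stands in for whatever processing the real job does.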

I have 6 shards and set the run to 6 slices. While it was running I checked Elasticsearch and saw that the open-context parameter was 36. I assume 36 is the number of scrolls. If so, what is the connection between slicing and scrolls? Is each scroll stored in the RAM of the node?
Is there some calculation to find the optimal number of slices for my cluster and my index?
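
One possible reading of the number 36, offered as an assumption rather than a confirmed answer: if every slice's scroll request is sent to every shard, and each shard keeps one open search context per scroll it receives, then the count would be slices times shards. The helper name below is made up for illustration.

```python
def expected_open_contexts(num_slices, num_shards):
    """Open contexts, ASSUMING each slice opens one context on every shard."""
    return num_slices * num_shards


# 6 slices over 6 shards, as in the setup described above.
print(expected_open_contexts(6, 6))  # 36
```

Under that assumption the contexts would be spread over the data nodes that hold the shards, and the figure would grow linearly with the number of slices.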

We have 2 data nodes, 200M records, and 6 shards.

Reading the docs, I found a formula that shows how a slice is assigned to a shard, but I still couldn't understand it.

Can a slice be assigned to multiple shards?
Can a slice have multiple scrolls?
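
As a rough aid for the slice-to-shard question, here is a small sketch of how the assignment could work. It is one reading of the rule described in the docs (slices are split across shards first, then documents inside a shard are split by a hash of `_id` modulo the slice count), not code taken from Elasticsearch, and the function name is hypothetical. The branch for fewer slices than shards in particular is an assumption.

```python
def shards_for_slice(slice_id, max_slices, num_shards):
    """Return the shard numbers that would hold documents for one slice.

    Sketch under assumptions: with at least as many slices as shards,
    a slice maps to exactly one shard (slice_id mod num_shards); with
    fewer slices than shards, a slice covers every shard whose number
    is congruent to slice_id modulo max_slices.
    """
    if max_slices >= num_shards:
        return [slice_id % num_shards]
    return [s for s in range(num_shards) if s % max_slices == slice_id]


# 6 slices, 6 shards: each slice would read from a single shard.
for i in range(6):
    print(i, shards_for_slice(i, 6, 6))

# 3 slices, 6 shards: each slice would span two shards.
for i in range(3):
    print(i, shards_for_slice(i, 3, 6))
```

If this reading is right, a slice spans several shards only when there are fewer slices than shards. When several slices share a shard, the documents inside it are divided among them by the `_id` hash.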

These are the things I am confused about, and I would gladly hear your explanation.

Thanks in advance.

system · July 1, 2021, 8:24am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
