How does sliced scan work in the background?

arkash20 · May 31, 2021, 10:53am

Hi,

I used the scan method with multiprocessing to retrieve data from a very large index we have (200M records). For that I used a sliced scan and set it to 6 slices, the same as the number of my shards.
Now come the theory questions, which I don't understand and couldn't find a clear answer to by googling.
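For context, a minimal sketch of the setup described above. The `slice` object with `id` and `max` is the documented request syntax for sliced scroll; the helper name `build_slice_bodies` and the `match_all` default are hypothetical, chosen only for illustration.

```python
def build_slice_bodies(num_slices, query=None):
    """Return one search body per slice; each body is meant for its own scroll."""
    # Elasticsearch rejects a slice request whose "max" is not greater than 1.
    if num_slices < 2:
        raise ValueError("slicing needs at least 2 slices")
    query = query or {"match_all": {}}
    return [
        {"slice": {"id": slice_id, "max": num_slices}, "query": query}
        for slice_id in range(num_slices)
    ]


# Six slices, one per worker process, as in the post.
bodies = build_slice_bodies(6)
```

Each body could then be handed to a separate worker that runs something like `elasticsearch.helpers.scan(client, index="my-index", query=body)`, where `client` and `"my-index"` are placeholders.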

I have 6 shards and set the run to 6 slices. While it was running I checked Elasticsearch and saw that the open-context parameter was 36. I assume 36 is the number of scrolls. If so, what is the connection between slicing and scrolls? Is each scroll stored in the RAM of the node?
Is there a calculation for finding the optimal number of slices for my cluster and my index?

We have 2 data nodes, 200M records, and 6 shards.
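One possible reading of the number 36, written out as arithmetic. It rests on an assumption that is not confirmed anywhere in this thread: that every slice is its own scroll request, and that each scroll request keeps one search context open on every shard it touches, even on shards where the slice matches nothing. The function name is hypothetical.

```python
def expected_open_contexts(num_slices, num_shards):
    """Contexts held open if every slice's scroll touches every shard (assumption)."""
    contexts_per_slice = num_shards  # assumed: one context per shard per scroll
    return num_slices * contexts_per_slice


print(expected_open_contexts(num_slices=6, num_shards=6))  # 36
```

Under that assumption the figure would be a count of per-shard contexts summed over the nodes rather than a count of scrolls, which would still be 6.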

Reading the docs, I found a formula that shows how a slice is assigned to a shard, but I still couldn't understand it.
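For reference, the formula in the scroll documentation is `slice(doc) = floorMod(hashCode(doc._id), max)`, and the docs give the example that with 2 shards and 4 slices, slices 0 and 2 go to the first shard and slices 1 and 3 to the second. A rough sketch of that reading follows. The function names are hypothetical, `zlib.crc32` is only a stand-in for whatever hash Elasticsearch uses internally, and how the two steps combine inside the server is not shown here.

```python
import zlib


def stand_in_hash(doc_id):
    # Stand-in only: NOT the hash Elasticsearch applies to _id.
    return zlib.crc32(doc_id.encode("utf-8"))


def slice_for_doc(doc_id, max_slices):
    # Mirrors floorMod(hashCode(doc._id), max); Python's % already floors
    # for a positive modulus.
    return stand_in_hash(doc_id) % max_slices


def shard_for_slice(slice_id, num_shards):
    # Generalizes the docs example (2 shards, 4 slices) as slice_id mod shards.
    # Assumes the number of slices is at least the number of shards.
    return slice_id % num_shards


print([shard_for_slice(s, 2) for s in range(4)])  # [0, 1, 0, 1]
print([shard_for_slice(s, 6) for s in range(6)])  # [0, 1, 2, 3, 4, 5]
```

On this reading, with 6 slices and 6 shards each slice would hold the documents of exactly one shard.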

Can a slice be assigned to multiple shards?
Can a slice have multiple scrolls?

These are the things I'm confused about, and I would gladly hear your explanation.

Thanks in advance.

system · June 28, 2021, 10:54am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
