I used the scan method with multiprocessing to retrieve data from a huge index we have (200M records). For that I used a sliced scan and set it to 6 slices, matching the number of shards.
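For context, this is roughly the shape of the setup, as a minimal sketch rather than the exact code: it assumes the `elasticsearch` Python client and its `helpers.scan` function, and the index name, cluster URL, and function names (`build_slice_query`, `dump_slice`, `run_all_slices`) are hypothetical placeholders.

```python
from multiprocessing import Pool

INDEX = "my-index"                 # hypothetical index name
ES_URL = "http://localhost:9200"   # hypothetical cluster address
NUM_SLICES = 6                     # one slice per shard, as described above


def build_slice_query(slice_id, max_slices):
    """Return a search body restricted to one slice of the scroll."""
    return {
        "slice": {"id": slice_id, "max": max_slices},
        "query": {"match_all": {}},
    }


def dump_slice(slice_id):
    """Scroll through one slice and return how many hits were read."""
    # Imported inside the worker so each process builds its own client.
    from elasticsearch import Elasticsearch
    from elasticsearch.helpers import scan

    client = Elasticsearch(ES_URL)
    body = build_slice_query(slice_id, NUM_SLICES)
    count = 0
    for _hit in scan(client, query=body, index=INDEX, scroll="5m", size=1000):
        count += 1  # placeholder for the real per-document work
    return count


def run_all_slices():
    """Run one worker process per slice and return the per-slice counts."""
    with Pool(processes=NUM_SLICES) as pool:
        return pool.map(dump_slice, range(NUM_SLICES))
```

Calling `run_all_slices()` from a script's `if __name__ == "__main__":` block starts the six workers; each one issues its own scroll request with a different `slice.id`.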
Now here come the theory questions, which I don't understand and couldn't find a clear answer to by googling.
I have 6 shards and set the scan to 6 slices. While it was running I checked Elasticsearch and saw that the open-context parameter is 36. I assume 36 is the number of scrolls. If so, what is the connection between slicing and scrolls? Is each scroll stored in the RAM of the node?
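For anyone who wants to reproduce that number, here is a minimal sketch of one way to read it. It assumes the figure corresponds to the `indices.search.open_contexts` field of the node stats response, and that the `elasticsearch` Python client is used; the function names and the cluster URL are hypothetical.

```python
def total_open_contexts(stats):
    """Sum indices.search.open_contexts over every node in a node-stats response."""
    return sum(
        node["indices"]["search"]["open_contexts"]
        for node in stats["nodes"].values()
    )


def fetch_open_contexts(es_url="http://localhost:9200"):
    """Query node stats and return the cluster-wide count of open search contexts."""
    # Imported here so the helper above can be used without the client installed.
    from elasticsearch import Elasticsearch

    client = Elasticsearch(es_url)  # hypothetical cluster address
    stats = client.nodes.stats(metric="indices", index_metric="search")
    return total_open_contexts(stats)
```

Running `fetch_open_contexts()` while the sliced scan is in progress should show the same kind of count, summed across both data nodes.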
Is there some calculation to find the optimal number of slices for my cluster and my index?
We have 2 data nodes, 200M records, 6 shards.
Reading the docs, I found a formula that shows how a slice is assigned to a shard, but I still couldn't understand it.
- Can a slice be assigned to multiple shards?
- Can a slice have multiple scrolls?
These are the things I'm confused about, and I'd gladly hear your explanation.
Thanks in advance.