Hello! I have a cluster with 20 data nodes, 3 master nodes and 2 ML nodes.

We want to do a rolling restart so that there is no loss of service. I know the procedure to follow:

disable shard allocation, temporarily stop ML jobs, restart the nodes one by one, etc.

But I am not sure whether restarting the currently elected master node will cause a loss of service, or whether the high availability provided by the 3 master nodes avoids this, and if so, how.
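For readers following along, the procedure listed in the question could be sketched roughly as below. This is only an outline under assumptions: `ES_URL`, `restart_plan` and `send` are hypothetical names, the cluster is assumed reachable without authentication or TLS, and the restart of each node is left as a manual step because it happens outside the REST API.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: no auth/TLS, adjust for your cluster


def restart_plan(nodes):
    """Return the ordered steps for a rolling restart of the given nodes.

    Each step is (method, path, body); "MANUAL" marks the restart itself,
    which is done outside the REST API (service manager, orchestration tool).
    """
    plan = [("POST", "/_ml/set_upgrade_mode?enabled=true", None)]  # pause ML once
    for node in nodes:
        plan += [
            # Only primaries may be allocated while the node is down.
            ("PUT", "/_cluster/settings",
             {"persistent": {"cluster.routing.allocation.enable": "primaries"}}),
            ("POST", "/_flush", None),            # speeds up shard recovery
            ("MANUAL", f"restart {node}", None),  # stop and start the node yourself
            # A null value resets the setting to its default.
            ("PUT", "/_cluster/settings",
             {"persistent": {"cluster.routing.allocation.enable": None}}),
            # Waits until the cluster is green or the timeout expires.
            ("GET", "/_cluster/health?wait_for_status=green&timeout=300s", None),
        ]
    plan.append(("POST", "/_ml/set_upgrade_mode?enabled=false", None))  # resume ML
    return plan


def send(method, path, body=None):
    """Send one REST step to the cluster and return the decoded JSON reply."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(ES_URL + path, data=data, method=method,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

One way to use it would be to print `restart_plan([...])` first and review the order, then call `send` for every step that is not marked `MANUAL`. Restarting the elected master last is a common choice, so that only one election is triggered.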

If you do a rolling restart, then under normal conditions the nodes will keep serving read requests during a master election, while indexing requests will wait (without returning) until a new master is elected. A master election typically takes 3 to 5 seconds. If you have a test cluster, you could run a test and see how this behaves.
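A rough way to run such a test is to poll the cluster while the elected master is restarted and measure how long no master is reported. The sketch below is an assumption-laden example, not an official tool: `ES_URL`, `current_master`, `longest_gap` and `watch` are hypothetical names, it should point at a node that stays up, and how the API responds while no master is elected may differ between versions, so any error or timeout is simply counted as "no master".

```python
import json
import time
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a node that is NOT being restarted


def current_master(timeout=1.0):
    """Return the elected master's node name, or None if none is reported in time."""
    try:
        url = ES_URL + "/_cat/master?format=json&h=node"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            rows = json.load(resp)
        return rows[0]["node"] if rows else None
    except (OSError, ValueError, KeyError):
        return None  # timeout, HTTP error or unexpected body: count as "no master"


def longest_gap(samples):
    """Longest closed stretch (seconds) without a master in (timestamp, master) samples."""
    longest, gap_start = 0.0, None
    for ts, master in samples:
        if master is None:
            if gap_start is None:
                gap_start = ts  # first sample of a new gap
        elif gap_start is not None:
            longest = max(longest, ts - gap_start)  # gap ends at this sample
            gap_start = None
    return longest


def watch(duration=60.0, interval=0.5):
    """Poll the cluster for `duration` seconds and return the collected samples."""
    samples, end = [], time.monotonic() + duration
    while time.monotonic() < end:
        samples.append((time.monotonic(), current_master()))
        time.sleep(interval)
    return samples
```

Starting `watch()` just before restarting the master and passing the result to `longest_gap` gives an estimate of the election window, accurate only to roughly the polling interval plus the request timeout.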

If I understand you correctly, there is no outage because requests (indexing, etc.) are not rejected; they are held until the election finishes, so it would not be an outage but a short delay of about 3 seconds. Is that right?

Anyway, as you suggested, I'll test it in dev first.

