Are there recommended procedures for rolling restarts of ECE runners for system-level patching or hardware maintenance?
- Zone by zone of course
- Do the control plane first, starting with ZooKeeper
- Before taking down a ZK node, check that you have exactly 3 or 5 ZooKeeper containers and that all of them are connected (e.g. there is a ZK status element under settings)
- If you have sufficient allocator capacity, put each allocator you are about to patch into maintenance mode first, then migrate clusters off it, then take it down
- (Of course, in many cases there is insufficient capacity to do this. In that case, ensure all non-HA clusters are migrated off, and be aware that you will lose HA during the rolling change.)
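
As a rough illustration of the ordering and the ZooKeeper pre-check above, here is a minimal Python sketch. Everything in it is hypothetical: the `Host` record, the role labels ("zookeeper", "control-plane", "allocator") and the function names are made up for illustration and are not part of ECE or its API. It reads "zone by zone" as sorting hosts by zone within each tier, which is only one way to interpret the advice.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical inventory record; not an ECE data structure.
    name: str
    zone: str
    roles: frozenset  # illustrative labels such as "zookeeper" or "allocator"


def zk_safe_to_stop(connected: int, total: int) -> bool:
    """Return True only if the ensemble has exactly 3 or 5 members
    and every one of them is currently connected."""
    return total in (3, 5) and connected == total


def patch_order(hosts):
    """Order hosts for patching: ZooKeeper hosts first, then the rest of
    the control plane, then everything else, sorted by zone in each tier."""
    def tier(host):
        if "zookeeper" in host.roles:
            return 0
        if "control-plane" in host.roles:
            return 1
        return 2

    return sorted(hosts, key=lambda h: (tier(h), h.zone, h.name))
```

The counts passed to `zk_safe_to_stop` would come from whatever the ZK status view reports; the sketch does not query ECE itself.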
At least 1.11 has a broken daemon restart, so the recommended way to bring a host up without ECE running is as follows:
- Disable the docker daemon (don't stop it); the exact command depends on the OS
- Reboot the host to bring it up without ECE
- Perform whatever changes are required
- Re-enable the docker daemon once done
- Reboot the host to bring it back with ECE running
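
The sequence above could be written down as an ordered checklist. The sketch below assumes a systemd-based host where the Docker unit is named `docker`; that is an assumption, since the exact commands depend on the OS. The function name `maintenance_steps` is hypothetical, and the function only builds the list of steps, it does not run anything.

```python
def maintenance_steps(service: str = "docker"):
    """Return the ordered steps for patching a host with ECE kept down.

    Assumes systemd (systemctl); adjust the commands for other init systems.
    """
    return [
        f"systemctl disable {service}",  # disable, but do not stop, the daemon
        "reboot",                        # host comes up without ECE
        "<perform the required changes>",
        f"systemctl enable {service}",   # re-enable the daemon once done
        "reboot",                        # host comes back with ECE running
    ]


if __name__ == "__main__":
    for number, step in enumerate(maintenance_steps(), start=1):
        print(f"{number}. {step}")
```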