I am currently operating the elastic-agent-complete container within a single-node Docker environment. Due to the critical nature and volume of monitoring we perform, it is imperative that we transition this deployment to a high-availability (HA) architecture to ensure continuity and resilience. Loss of monitoring capabilities is not an acceptable risk.
After evaluating potential solutions, I have identified two primary options for achieving high availability:
- Docker Swarm
  Docker Swarm offers native support for container orchestration and is well-suited for running images like elastic-agent-complete, which closely resemble full Linux distributions. Its simplicity and compatibility with Docker-based workflows make it a viable candidate.
- Kubernetes
  Kubernetes provides a robust and scalable orchestration platform with advanced features for managing containerized workloads. While it introduces additional complexity, it may offer greater flexibility and resilience in the long term.
An alternative approach worth considering is the deployment of multiple dedicated Synthetics servers. However, I have not yet fully explored the operational implications of this model, specifically whether it supports automatic load balancing and how it maintains monitor continuity when a node fails.
At this stage, I am seeking guidance from those who have implemented similar solutions. I would appreciate insights into the trade-offs between Docker Swarm and Kubernetes in this context, as well as any experiences with scaling Synthetics infrastructure for high availability.