We are considering running Elasticsearch under Docker Swarm 1.13. At first look, this sounds like a natural approach. However, there are a few points that concern me, mainly related to the fact that Swarm will always replicate services up to the desired count.
Assuming Elasticsearch keeps all its data on local disk, I have two options: mount a volume from the host filesystem, or use a container-'internal' volume without mounting any host filesystem into the container.
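For reference, the two setups would look roughly like this in a Compose v3 stack file (image tag, volume name, and host path are illustrative, not a recommendation):

```yaml
version: "3"
services:
  elasticsearch:
    image: elasticsearch:5.1        # hypothetical tag
    volumes:
      # Option 1: named volume, managed by Docker on whichever
      # node the task lands on ('internal' from the host's view):
      - esdata:/usr/share/elasticsearch/data
      # Option 2: bind-mount a host directory instead:
      # - /srv/esdata:/usr/share/elasticsearch/data

volumes:
  esdata:
```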
The two options behave differently under Swarm:
Using an 'internal' volume means losing all my data any time a task (i.e. a single Elasticsearch container in the Elasticsearch cluster) goes down and Swarm starts another task in place of the faulty one. This also leaves large dangling volumes on my disk.
Using a host-mounted volume, I will likewise lose the data when the task goes down and Swarm starts a new task elsewhere. But now, if for any reason an Elasticsearch task runs again on the same node, it will find the stale data volume. What will happen then?
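One way to make the "same node again" case more predictable is to pin tasks to labelled nodes with a placement constraint, so a restarted task deliberately finds its old host volume. A sketch, assuming a label added beforehand with `docker node update --label-add es=true <node>`:

```yaml
version: "3"
services:
  elasticsearch:
    image: elasticsearch:5.1        # hypothetical tag
    volumes:
      - /srv/esdata:/usr/share/elasticsearch/data   # illustrative host path
    deploy:
      placement:
        constraints:
          # Only schedule this service on nodes carrying the label,
          # so a rescheduled task lands where its data already lives.
          - node.labels.es == true
```

Whether Elasticsearch can safely reuse that directory after an unclean shutdown is exactly the open question, so this only narrows where the scenario occurs, it does not resolve it.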
In both cases, Elasticsearch will replicate all my data across the cluster, so I guess I have to make really sure the task is down and not recoverable.
In the end, it seems I need some dirty tricks to retain full control over my data in the cluster.
Does anyone have experience running Elasticsearch on Swarm? Any rules of thumb or guidelines?