Elastic cluster: Docker or VMs for production?

We have to use 3 old servers as an Elasticsearch cluster. Each server has:

CPU(s): 32
Thread(s) per core: 2
Model name: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
Memory: 94GB
Disk: 4x 1.5TB SSD + 25TB HDD

Could you advise on the best practice for getting the most performance and reliability out of these 3 boxes (and maybe more coming in the future)?

We were thinking of running 3 Docker containers on each physical node = 9 Elasticsearch nodes in the cluster in total. But I have an issue where nodes on different hosts do not see each other (details below; any help appreciated).

However, is it a good idea to use Docker in production, or would it be better to use each physical machine as a single Elasticsearch node? Or should we split each host into multiple VMs with VMware? What would be the recommended approach?
Thank you for your advice.

The 3 Docker containers started on the first physical host work OK (they established a cluster of 3 nodes), but we are unable to get another Docker container on the second host to join. The container starts but cannot join the cluster on the first host. I haven't found many threads resolving this issue.

These are the commands we use to run the cluster on the first host (and it is working), but we can't get the 4th node (es-node3) on the second host to establish a connection with the cluster on host 1. We have tried an overlay network, etc.

docker run -d --restart=always -p 9200:9200 -p 9300:9300 --name es-node0 --network es-net --cpuset-cpus="0-7" --cap-add=IPC_LOCK --ulimit nofile=65536:65536 --ulimit memlock=-1:-1 -v /mnt/ssd1/esdata1:/usr/share/elasticsearch/data -e cluster.name=docker-cluster -e bootstrap.memory_lock=true -e http.cors.enabled=true -e http.cors.allow-origin=* -e "ES_JAVA_OPTS=-Xms24g -Xmx24g" -e Des.bootstrap.mlockall=true -e Des.network.host=bond0 -e Des.discovery.zen.ping.multicast.enabled=false elasticsearch:6.8.0

docker run -d --restart=always -p 9201:9200 -p 9301:9300 --name es-node1 --network es-net --cpuset-cpus="8-15" --cap-add=IPC_LOCK --ulimit nofile=65536:65536 --ulimit memlock=-1:-1 -v /mnt/ssd2/esdata2:/usr/share/elasticsearch/data -e cluster.name=docker-cluster -e bootstrap.memory_lock=true -e http.cors.enabled=true -e http.cors.allow-origin=* -e "ES_JAVA_OPTS=-Xms24g -Xmx24g" -e Des.bootstrap.mlockall=true -e Des.network.host=bond0 -e Des.discovery.zen.ping.multicast.enabled=false -e discovery.zen.ping.unicast.hosts="es-node0" elasticsearch:6.8.0

docker run -d --restart=always -p 9202:9200 -p 9302:9300 --name es-node2 --network es-net --cpuset-cpus="24-31" --cap-add=IPC_LOCK --ulimit nofile=65536:65536 --ulimit memlock=-1:-1 -v /mnt/ssd3/esdata3:/usr/share/elasticsearch/data -e cluster.name=docker-cluster -e bootstrap.memory_lock=true -e http.cors.enabled=true -e http.cors.allow-origin=* -e "ES_JAVA_OPTS=-Xms24g -Xmx24g" -e Des.bootstrap.mlockall=true -e Des.network.host=bond0 -e Des.discovery.zen.ping.multicast.enabled=false -e discovery.zen.ping.unicast.hosts="es-node0" elasticsearch:6.8.0

docker run -d --restart=always -p 9203:9200 -p 9303:9300 --name es-node3 --network es-net --cpuset-cpus="8-15" --cap-add=IPC_LOCK --ulimit nofile=65536:65536 --ulimit memlock=-1:-1 -v /mnt/ssd1/esdata1:/usr/share/elasticsearch/data -e cluster.name=docker-cluster -e bootstrap.memory_lock=true -e http.cors.enabled=true -e http.cors.allow-origin=* -e "ES_JAVA_OPTS=-Xms24g -Xmx24g" -e Des.bootstrap.mlockall=true -e Des.network.host=bond0 -e Des.discovery.zen.ping.multicast.enabled=false -e discovery.zen.ping.unicast.hosts="10.2.46.21:9300" elasticsearch:6.8.0
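For context on the failure mode: a user-defined bridge network such as es-net exists only on the host where it was created, so es-node3 on host 2 cannot resolve es-node0 by name, and the nodes on host 1 advertise container-internal addresses that are unreachable from other hosts. One common workaround is to make each node advertise its host's address and published port. This is only a sketch with most options trimmed; the host 2 address 10.2.46.22 is an assumption (only 10.2.46.21 appears in the commands above):

```shell
# Sketch: es-node3 on host 2 (assumed IP 10.2.46.22) advertising the
# host's address so nodes on host 1 can dial back in through the
# published ports, instead of an unreachable container-internal address.
docker run -d --restart=always -p 9203:9200 -p 9303:9300 --name es-node3 \
  -e cluster.name=docker-cluster \
  -e network.publish_host=10.2.46.22 \
  -e transport.publish_port=9303 \
  -e discovery.zen.ping.unicast.hosts=10.2.46.21:9300 \
  elasticsearch:6.8.0
```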

Both Docker and VMs are viable architectures (assuming you can work out the bugs in your Docker network config) but why not just run multiple Elasticsearch nodes directly on each host? Docker gives you stronger isolation for cases where you need it, but I'm wondering why you need it here on machines that are dedicated to running Elasticsearch.

You do not have enough memory to run three Elasticsearch nodes each with a 24GB heap, because you must set the heap size to no more than 50% of the RAM available to each node. With 94GB you should configure no more than a 15GB heap per node: 15GB works for 3 nodes because 15x3 = 45GB, which is just under half of your total RAM.
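The arithmetic above, as a quick sketch:

```shell
# Rule of thumb: the combined heap of all nodes on a host should stay at
# or below 50% of the host's RAM, leaving the rest for the OS filesystem
# cache, which Elasticsearch relies on heavily.
total_ram_gb=94
nodes_per_host=3
heap_gb=$(( total_ram_gb / 2 / nodes_per_host ))
echo "${heap_gb}g per node"   # 94/2/3 = 15 (integer division)
```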


@DavidTurner
Great, thank you. I did not know about the option of running 2 instances on 1 host without Docker. I have concerns about one binary serving two configurations, and about potential problems during upgrades. Our hardware is not symmetric (different types of CPU, RAM, and disks), which is why we decided on Docker, where we can pin specific CPUs to each container (--cpuset-cpus="0-7").
Do you have a guideline for setting up a Docker cluster? Should we use Docker Swarm? What is the best practice for setting up the Docker cluster?
Are 3 Docker containers on one host a valid option for production?
We will definitely change the memory settings, good point. For a 96GB host we will create 3 nodes of 32GB each, i.e. a 16GB heap for each node.

Using the same binaries for multiple nodes is a viable configuration: you just set the ES_PATH_CONF environment variable differently to run nodes with different configs. Upgrades should not be hard to orchestrate in that setup either; for instance, each version of the .tar.gz distribution unpacks into a different directory. Docker is not required for pinning things to specific CPUs - you can use the taskset command on bare processes.
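As a sketch of what that looks like on one host (the paths, config directories, and pidfile names here are hypothetical):

```shell
# One extracted tarball, two config directories, two CPU-pinned processes.
ES_HOME=/opt/elasticsearch-6.8.0

# Node A: config in /etc/es/node-a, pinned to CPUs 0-7, daemonized.
ES_PATH_CONF=/etc/es/node-a taskset -c 0-7 \
  "$ES_HOME/bin/elasticsearch" -d -p /var/run/es-node-a.pid

# Node B: config in /etc/es/node-b, pinned to CPUs 8-15, daemonized.
ES_PATH_CONF=/etc/es/node-b taskset -c 8-15 \
  "$ES_HOME/bin/elasticsearch" -d -p /var/run/es-node-b.pid
```

Each config directory holds its own elasticsearch.yml with a distinct node.name, path.data, http.port, and transport.tcp.port, so the two nodes do not collide on the same host.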

I really recommend staying away from running your production cluster in Docker until you're much more comfortable running things in Docker. I don't know of any Docker Swarm integration for Elasticsearch, but there is Kubernetes integration. I would say that setting up a Kubernetes environment would be overkill for your situation.


@DavidTurner thank you.
We decided to stay with Docker, since all the other components in this solution also run in Docker. We were finally able to establish the cluster of 6 nodes across 3 hosts, but we may switch to your proposed solution after I complete tests of this setup.
Do you have any experience comparing the performance of Docker vs. a plain CentOS installation? What would you consider the main pluses/minuses?
Should we benchmark it with Rally?

Finally, I was able to configure 2 nodes on each host and connect them into a cluster using network_mode: host:

user@host3:~/es-node31$ cat docker-compose.yml
version: '2.2'
services:
  es-node31:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.8.0
    container_name: es-node31
    environment:
      - node.name=es-node31
      - "discovery.zen.ping.unicast.hosts=host2.domain.int,host3.domain.int"
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms24g -Xmx24g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    ports:
      - 9200:9200
      - 9300:9300
    cpuset: "0-11"
    network_mode: host
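A note on this layout: with network_mode: host, both containers share the host's network stack, so the ports: mapping is ignored and a second node on the same host has to bind different ports in its own compose file. A sketch of the differing lines for a hypothetical second node es-node32 (the name and port numbers are assumptions):

```yaml
  es-node32:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.8.0
    container_name: es-node32
    network_mode: host
    environment:
      - node.name=es-node32
      # Shift both ports so the two host-networked nodes do not collide:
      - http.port=9201
      - transport.tcp.port=9301
```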

Ok, sounds good. It wasn't clear to me that you had a more extensive Docker estate.

I don't know of any performance differences between Docker and non-Docker installations. The biggest benefits of Docker for me are isolation and orchestration when running lots of clusters, and the biggest drawback is all the extra complexity that it introduces over simply running bare processes.
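On the Rally question above: Rally is the standard benchmarking tool for Elasticsearch. A minimal sketch pointing it at an already-running cluster (the hostnames are taken from your compose file; the geonames track is just one of the bundled workloads) uses the benchmark-only pipeline so Rally measures the cluster instead of provisioning its own:

```shell
pip install --user esrally
esrally --track=geonames \
  --target-hosts=host2.domain.int:9200,host3.domain.int:9200 \
  --pipeline=benchmark-only
```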


The last issue I have is disk assignment; with a non-Docker installation it is all much easier for me, as I only entered the Docker world recently. So I created a separate topic.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.